Related
Here is what I am using:
class something
{
char flags[26][80];
} a;
std::fill(&a.flags[0][0], &a.flags[0][0] + 26 * 80, 0);
(Update: I should have made it clear earlier that I am using this inside a class.)
The simple way to initialize to 0 the array is in the definition:
char flags[26][80] = {};
If you want to use std::fill, or you want to reset the array, I find this a little better:
char flags[26][80];
std::fill( &flags[0][0], &flags[0][0] + sizeof(flags) /* / sizeof(flags[0][0]) */, 0 );
The fill expressed in terms of the array size will allow you to change the dimensions and keep the fill untouched. The sizeof(flags[0][0]) is 1 in your case (sizeof(char)==1), but you might want to leave it there in case you want to change the type at any point.
In this particular case (array of flags --integral type) I could even consider using memset even if it is the least safe alternative (this will break if the array type is changed to a non-pod type):
memset( &flags[0][0], 0, sizeof(flags) );
Note that in all three cases, the array sizes are typed only once, and the compiler deduces the rest. That is a little safer as it leaves less room for programmer errors (change the size in one place, forget it in the others).
EDIT: You have updated the code, and as it is it won't compile as the array is private and you are trying to initialize it externally. Depending on whether your class is actually an aggregate (and want to keep it as such) or whether you want to add a constructor to the class you can use different approaches.
const std::size_t rows = 26;
const std::size_t cols = 80;
struct Aggregate {
char array[rows][cols];
};
class Constructor {
public:
Constructor() {
std::fill( &array[0][0], &array[rows][0], 0 ); // [1]
// memset( array, 0, sizeof(array) );
}
private:
char array[rows][cols];
};
int main() {
Aggregate a = {};
Constructor b;
}
Even if the array is meant to be public, using a constructor might be a better approach as it will guarantee that the array is properly initialized in all instances of the class, while the external initialization depends on user code not forgetting to set the initial values.
[1] As #Oli Charlesworth mentioned in a comment, using constants is a different solution to the problem of having to state (and keep in synch) the sizes in more than one place. I have used that approach here with a yet different combination: a pointer to the first byte outside of the bidimensional array can be obtained by requesting the address of the first column one row beyond the bidimensional array. I have used this approach just to show that it can be done, but it is not any better than others like &array[0][0]+(rows*cols)
What is the safe way to fill multidimensional array using std::fill?
The easy default initialization would be using braced inilization.
char flags[26][80]{};
The above will initialize all the elements in the flags to default char.
2-D Array filling using std::fill or std::fill_n
However, in order to provide different value to initialize the above is not enough. The options are std::fill and std::fill_n. (Assuming that the array flags is public in your class)
std::fill(
&a.flags[0][0],
&a.flags[0][0] + sizeof(a.flags) / sizeof(a.flags[0][0]),
'0');
// or using `std::fill_n`
// std::fill_n(&a.flags[0][0], sizeof(a.flags) / sizeof(a.flags[0][0]), '1');
To generalize this for any 2d-array of any type with any initializing value, I would suggest a templated function as follows. This will also avoid the sizeof calculation of the total elements in the array.
#include <algorithm> // std::fill_n, std::fill
#include <cstddef> // std::size_t
template<typename Type, std::size_t M, std::size_t N>
constexpr void fill_2D_array(Type(&arr2D)[M][N], const Type val = Type{}) noexcept
{
std::fill_n(&arr2D[0][0], M * N, val);
// or using std::fill
// std::fill(&arr2D[0][0], &arr2D[0][0] + (M * N ), val);
}
Now you can initialize your flags like
fill_2D_array(a.flags, '0'); // flags should be `public` in your class!
(See Live Online)
3-D Array filling using std::fill or std::fill_n
Adding one more non-template size parameter to the above template function, this can be brought to 3d-arrays as well
#include <algorithm> // std::fill_n
#include <cstddef> // std::size_t
template<typename Type, std::size_t M, std::size_t N, std::size_t O>
constexpr void fill_3D_array(Type(&arr3D)[M][N][O], const Type val = Type{}) noexcept
{
std::fill_n(&arr3D[0][0][0], M * N * O, val);
}
(See Live Online)
it is safe, a two-dimensional array is an array of arrays. Since an array occupied contiguous storage, so the whole multidimensional thing will too. So yeah, it's OK, safe and portable. Assuming you are NOT asking about style, which is covered by other answers (since you're using flags, I strongly recommend std::vector<std::bitset<80> > myFlags(26))
char flags[26][80];
std::fill((char*)flags, (char*)flags + sizeof(flags)/sizeof(char), 0);
Is char[80] supposed to be a substitute for a real string type? In that case, I recommend the following:
std::vector<std::string> flags(26);
flags[0] = "hello";
flags[1] = "beautiful";
flags[2] = "world";
// ...
Or, if you have a C++ compiler that supports initialization lists, for example a recent g++ compiler:
std::vector<std::string> flags { "hello", "beautiful", "world" /* ... */ };
Here's an interesting question about the various quirks of the C++ language. I have a pair of functions, which are supposed to fill an array of points with the corners of a rectangle. There are two overloads for it: one takes a Point[5], the other takes a Point[4]. The 5-point version refers to a closed polygon, whereas the 4-point version is when you just want the 4 corners, period.
Obviously there's some duplication of work here, so I'd like to be able to use the 4-point version to populate the first 4 points of the 5-point version, so I'm not duplicating that code. (Not that it's much to duplicate, but I have terrible allergic reactions whenever I copy and paste code, and I'd like to avoid that.)
The thing is, C++ doesn't seem to care for the idea of converting a T[m] to a T[n] where n < m. static_cast seems to think the types are incompatible for some reason. reinterpret_cast handles it fine, of course, but is a dangerous animal that, as a general rule, is better to avoid if at all possible.
So my question is: is there a type-safe way of casting an array of one size to an array of a smaller size where the array type is the same?
[Edit] Code, yes. I should have mentioned that the parameter is actually a reference to an array, not simply a pointer, so the compiler is aware of the type difference.
void RectToPointArray(const degRect& rect, degPoint(&points)[4])
{
points[0].lat = rect.nw.lat; points[0].lon = rect.nw.lon;
points[1].lat = rect.nw.lat; points[1].lon = rect.se.lon;
points[2].lat = rect.se.lat; points[2].lon = rect.se.lon;
points[3].lat = rect.se.lat; points[3].lon = rect.nw.lon;
}
void RectToPointArray(const degRect& rect, degPoint(&points)[5])
{
// I would like to use a more type-safe check here if possible:
RectToPointArray(rect, reinterpret_cast<degPoint(&)[4]> (points));
points[4].lat = rect.nw.lat; points[4].lon = rect.nw.lon;
}
[Edit2] The point of passing an array-by-reference is so that we can be at least vaguely sure that the caller is passing in a correct "out parameter".
I don't think it's a good idea to do this by overloading. The name of the function doesn't tell the caller whether it's going to fill an open array or not. And what if the caller has only a pointer and wants to fill coordinates (let's say he wants to fill multiple rectangles to be part of a bigger array at different offsets)?
I would do this by two functions, and let them take pointers. The size isn't part of the pointer's type
void fillOpenRect(degRect const& rect, degPoint *p) {
...
}
void fillClosedRect(degRect const& rect, degPoint *p) {
fillOpenRect(rect, p); p[4] = p[0];
}
I don't see what's wrong with this. Your reinterpret-cast should work fine in practice (i don't see what could go wrong - both alignment and representation will be correct, so the merely formal undefinedness won't carry out to reality here, i think), but as i said above i think there's no good reason to make these functions take the arrays by reference.
If you want to do it generically, you can write it by output iterators
template<typename OutputIterator>
OutputIterator fillOpenRect(degRect const& rect, OutputIterator out) {
typedef typename iterator_traits<OutputIterator>::value_type value_type;
value_type pt[] = {
{ rect.nw.lat, rect.nw.lon },
{ rect.nw.lat, rect.se.lon },
{ rect.se.lat, rect.se.lon },
{ rect.se.lat, rect.nw.lon }
};
for(int i = 0; i < 4; i++)
*out++ = pt[i];
return out;
}
template<typename OutputIterator>
OutputIterator fillClosedRect(degRect const& rect, OutputIterator out) {
typedef typename iterator_traits<OutputIterator>::value_type value_type;
out = fillOpenRect(rect, out);
value_type p1 = { rect.nw.lat, rect.nw.lon };
*out++ = p1;
return out;
}
You can then use it with vectors and also with arrays, whatever you prefer most.
std::vector<degPoint> points;
fillClosedRect(someRect, std::back_inserter(points));
degPoint points[5];
fillClosedRect(someRect, points);
If you want to write safer code, you can use the vector way with back-inserters, and if you work with lower level code, you can use a pointer as output iterator.
I would use std::vector or (this is really bad and should not be used) in some extreme cases you can even use plain arrays via pointer like Point* and then you shouldn't have such "casting" troubles.
Why don't you just pass a standard pointer, instead of a sized one, like this
void RectToPointArray(const degRect& rect, degPoint * points ) ;
I don't think your framing/thinking of the problem is correct. You don't generally need to concretely type an object that has 4 vertices vs an object that has 5.
But if you MUST type it, then you can use structs to concretely define the types instead.
struct Coord
{
float lat, long ;
} ;
Then
struct Rectangle
{
Coord points[ 4 ] ;
} ;
struct Pentagon
{
Coord points[ 5 ] ;
} ;
Then,
// 4 pt version
void RectToPointArray(const degRect& rect, const Rectangle& rectangle ) ;
// 5 pt version
void RectToPointArray(const degRect& rect, const Pentagon& pent ) ;
I think this solution is a bit extreme however, and a std::vector<Coord> that you check its size (to be either 4 or 5) as expected with asserts, would do just fine.
I guess you could use function template specialization, like this (simplified example where first argument was ignored and function name was replaced by f(), etc.):
#include <iostream>
using namespace std;
class X
{
};
template<int sz, int n>
int f(X (&x)[sz])
{
cout<<"process "<<n<<" entries in a "<<sz<<"-dimensional array"<<endl;
int partial_result=f<sz,n-1>(x);
cout<<"process last entry..."<<endl;
return n;
}
//template specialization for sz=5 and n=4 (number of entries to process)
template<>
int f<5,4>(X (&x)[5])
{
cout<<"process only the first "<<4<<" entries here..."<<endl;
return 4;
}
int main(void)
{
X u[5];
int res=f<5,5>(u);
return 0;
}
Of course you would have to take care of other (potentially dangerous) special cases like n={0,1,2,3} and you're probably better off using unsigned int's instead of ints.
So my question is: is there a
type-safe way of casting an array of
one size to an array of a smaller size
where the array type is the same?
No. I don't think the language allows you to do this at all: consider casting int[10] to int[5]. You can always get a pointer to it, however, but we can't 'trick' the compiler into thinking a fixed-sized has a different number of dimensions.
If you're not going to use std::vector or some other container which can properly identify the number of points inside at runtime and do this all conveniently in one function instead of two function overloads which get called based on the number of elements, rather than trying to do crazy casts, consider this at least as an improvement:
void RectToPointArray(const degRect& rect, degPoint* points, unsigned int size);
If you're set on working with arrays, you can still define a generic function like this:
template <class T, size_t N>
std::size_t array_size(const T(&/*array*/)[N])
{
return N;
}
... and use that when calling RectToPointArray to pass the argument for 'size'. Then you have a size you can determine at runtime and it's easy enough to work with size - 1, or more appropriate for this case, just put a simple if statement to check if there are 5 elements or 4.
Later if you change your mind and use std::vector, Boost.Array, etc. you can still use this same old function without modifying it. It only requires that the data is contiguous and mutable. You can get fancy with this and apply very generic solutions that, say, only require forward iterators. Yet I don't think this problem is complicated enough to warrant such a solution: it'd be like using a cannon to kill a fly; fly swatter is okay.
If you're really set on the solution you have, then it's easy enough to do this:
template <size_t N>
void RectToPointArray(const degRect& rect, degPoint(&points)[N])
{
assert(N >= 4 && "points requires at least 4 elements!");
points[0].lat = rect.nw.lat; points[0].lon = rect.nw.lon;
points[1].lat = rect.nw.lat; points[1].lon = rect.se.lon;
points[2].lat = rect.se.lat; points[2].lon = rect.se.lon;
points[3].lat = rect.se.lat; points[3].lon = rect.nw.lon;
if (N >= 5)
points[4].lat = rect.nw.lat; points[4].lon = rect.nw.lon;
}
Yeah, there is one unnecessary runtime check but trying to do it at compile time is probably analogous to taking things out of your glove compartment in an attempt to increase your car's fuel efficiency. With N being a compile-time constant expression, the compiler is likely to recognize that the condition is always false when N < 5 and just eliminate that whole section of code.
This question already has answers here:
Initialization of all elements of an array to one default value in C++?
(12 answers)
Closed 4 months ago.
I'm trying to initialize an int array with everything set at -1.
I tried the following, but it doesn't work. It only sets the first value at -1.
int directory[100] = {-1};
Why doesn't it work right?
I'm surprised at all the answers suggesting vector. They aren't even the same thing!
Use std::fill, from <algorithm>:
int directory[100];
std::fill(directory, directory + 100, -1);
Not concerned with the question directly, but you might want a nice helper function when it comes to arrays:
template <typename T, size_t N>
T* end(T (&pX)[N])
{
return pX + N;
}
Giving:
int directory[100];
std::fill(directory, end(directory), -1);
So you don't need to list the size twice.
I would suggest using std::array. For three reasons:
1. array provides runtime safety against index-out-of-bound in subscripting (i.e. operator[]) operations,
2. array automatically carries the size without requiring to pass it separately
3. And most importantly, array provides the fill() method that is required for
this problem
#include <array>
#include <assert.h>
typedef std::array< int, 100 > DirectoryArray;
void test_fill( DirectoryArray const & x, int expected_value ) {
for( size_t i = 0; i < x.size(); ++i ) {
assert( x[ i ] == expected_value );
}
}
int main() {
DirectoryArray directory;
directory.fill( -1 );
test_fill( directory, -1 );
return 0;
}
Using array requires use of "-std=c++0x" for compiling (applies to the above code).
If that is not available or if that is not an option, then the other options like std::fill() (as suggested by GMan) or hand coding the a fill() method may be opted.
If you had a smaller number of elements you could specify them one after the other. Array initialization works by specifying each element, not by specifying a single value that applies for each element.
int x[3] = {-1, -1, -1 };
You could also use a vector and use the constructor to initialize all of the values. You can later access the raw array buffer by specifying &v.front()
std::vector directory(100, -1);
There is a C way to do it also using memset or various other similar functions. memset works for each char in your specified buffer though so it will work fine for values like 0 but may not work depending on how negative numbers are stored for -1.
You can also use STL to initialize your array by using fill_n. For a general purpose action to each element you could use for_each.
fill_n(directory, 100, -1);
Or if you really want you can go the lame way, you can do a for loop with 100 iterations and doing directory[i] = -1;
If you really need arrays, you can use boosts array class. It's assign member does the job:
boost::array<int,N> array; // boost arrays are of fixed size!
array.assign(-1);
It does work right. Your expectation of the initialiser is incorrect. If you really wish to take this approach, you'll need 100 comma-separated -1s in the initialiser. But then what happens when you increase the size of the array?
use vector of int instead a array.
vector<int> directory(100,-1); // 100 ints with value 1
It is working right. That's how list initializers work.
I believe 6.7.8.10 of the C99 standard covers this:
If an object that has automatic
storage duration is not initialized
explicitly, its value is
indeterminate. If an object that has
static storage duration is not
initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned)
zero;
if it is an aggregate, every member is initialized (recursively) according
to these rules;
if it is a union, the first named member is initialized (recursively)
according to these rules.
If you need to make all the elements in an array the same non-zero value, you'll have to use a loop or memset.
Also note that, unless you really know what you're doing, vectors are preferred over arrays in C++:
Here's what you need to realize about containers vs. arrays:
Container classes make programmers more productive. So if you insist on using arrays while those around are willing to use container classes, you'll probably be less productive than they are (even if you're smarter and more experienced than they are!).
Container classes let programmers write more robust code. So if you insist on using arrays while those around are willing to use container classes, your code will probably have more bugs than their code (even if you're smarter and more experienced).
And if you're so smart and so experienced that you can use arrays as fast and as safe as they can use container classes, someone else will probably end up maintaining your code and they'll probably introduce bugs. Or worse, you'll be the only one who can maintain your code so management will yank you from development and move you into a full-time maintenance role — just what you always wanted!
There's a lot more to the linked question; give it a read.
u simply use for loop as done below:-
for (int i=0; i<100; i++)
{
a[i]= -1;
}
as a result as u want u can get
A[100]={-1,-1,-1..........(100 times)}
I had the same question and I found how to do, the documentation give the following example :
std::array<int, 3> a1{ {1, 2, 3} }; // double-braces required in C++11 (not in C++14)
So I just tried :
std::array<int, 3> a1{ {1} }; // double-braces required in C++11 (not in C++14)
And it works all elements have 1 as value. It does not work with the = operator. It is maybe a C++11 issue.
Can't do what you're trying to do with a raw array (unless you explicitly list out all 100 -1s in the initializer list), you can do it with a vector:
vector<int> directory(100, -1);
Additionally, you can create the array and set the values to -1 using one of the other methods mentioned.
Just use this loop.
for(int i =0 ; i < 100 ; i++) directory[i] =0;
the almighty memset() will do the job for array and std containers in C/C++/C++11/C++14
The reason that int directory[100] = {-1} doesn't work is because of what happens with array initialization.
All array elements that are not initialized explicitly are initialized implicitly the same way as objects that have static storage duration.
ints which are implicitly initialized are:
initialized to unsigned zero
All array elements that are not initialized explicitly are initialized implicitly the same way as objects that have static storage duration.
C++11 introduced begin and end which are specialized for arrays!
This means that given an array (not just a pointer), like your directory you can use fill as has been suggested in several answers:
fill(begin(directory), end(directory), -1)
Let's say that you write code like this, but then decide to reuse the functionality after having forgotten how you implemented it, but you decided to change the size of directory to 60. If you'd written code using begin and end then you're done.
If on the other hand you'd done this: fill(directory, directory + 100, -1) then you'd better remember to change that 100 to a 60 as well or you'll get undefined behavior.
If you are allowed to use std::array, you can do the following:
#include <iostream>
#include <algorithm>
#include <array>
using namespace std;
template <class Elem, Elem pattern, size_t S, size_t L>
struct S_internal {
template <Elem... values>
static array<Elem, S> init_array() {
return S_internal<Elem, pattern, S, L - 1>::init_array<values..., pattern>();
}
};
template <class Elem, Elem pattern, size_t S>
struct S_internal<Elem, pattern, S, 0> {
template <Elem... values>
static array<Elem, S> init_array() {
static_assert(S == sizeof...(values), "");
return array<Elem, S> {{values...}};
}
};
template <class Elem, Elem pattern, size_t S>
struct init_array
{
static array<Elem, S> get() {
return S_internal<Elem, pattern, S, S>::init_array<>();
}
};
void main()
{
array<int, 5> ss = init_array<int, 77, 5>::get();
copy(cbegin(ss), cend(ss), ostream_iterator<int>(cout, " "));
}
The output is:
77 77 77 77 77
Just use the fill_n() method.
Example
int n;
cin>>n;
int arr[n];
int value = 9;
fill_n(arr, n, value); // 9 9 9 9 9...
Learn More about fill_n()
or
you can use the fill() method.
Example
int n;
cin>>n;
int arr[n];
int value = 9;
fill(arr, arr+n, value); // 9 9 9 9 9...
Learn More about fill() method.
Note: Both these methods are available in algorithm library (#include<algorithm>). Don't forget to include it.
Starting with C++11 you could also use a range based loop:
int directory[10];
for (auto& value: directory) value = -1;
I want to create an adjacency matrix for a graph. Since I read it is not safe to use arrays of the form matrix[x][y] because they don't check for range, I decided to use the vector template class of the stl. All I need to store in the matrix are boolean values. So my question is, if using std::vector<std::vector<bool>* >* produces too much overhead or if there is a more simple way for a matrix and how I can properly initialize it.
EDIT: Thanks a lot for the quick answers. I just realized, that of course I don't need any pointers. The size of the matrix will be initialized right in the beginning and won't change until the end of the program. It is for a school project, so it would be good if I write "nice" code, although technically performance isn't too important. Using the STL is fine. Using something like boost, is probably not appreciated.
Note that also you can use boost.ublas for matrix creation and manipulation and also boost.graph to represent and manipulate graphs in a number of ways, as well as using algorithms on them, etc.
Edit: Anyway, doing a range-check version of a vector for your purposes is not a hard thing:
template <typename T>
class BoundsMatrix
{
std::vector<T> inner_;
unsigned int dimx_, dimy_;
public:
BoundsMatrix (unsigned int dimx, unsigned int dimy)
: dimx_ (dimx), dimy_ (dimy)
{
inner_.resize (dimx_*dimy_);
}
T& operator()(unsigned int x, unsigned int y)
{
if (x >= dimx_ || y>= dimy_)
throw std::out_of_range("matrix indices out of range"); // ouch
return inner_[dimx_*y + x];
}
};
Note that you would also need to add the const version of the operators, and/or iterators, and the strange use of exceptions, but you get the idea.
Best way:
Make your own matrix class, that way you control every last aspect of it, including range checking.
eg. If you like the "[x][y]" notation, do this:
class my_matrix {
std::vector<std::vector<bool> >m;
public:
my_matrix(unsigned int x, unsigned int y) {
m.resize(x, std::vector<bool>(y,false));
}
class matrix_row {
std::vector<bool>& row;
public:
matrix_row(std::vector<bool>& r) : row(r) {
}
bool& operator[](unsigned int y) {
return row.at(y);
}
};
matrix_row& operator[](unsigned int x) {
return matrix_row(m.at(x));
}
};
// Example usage
my_matrix mm(100,100);
mm[10][10] = true;
nb. If you program like this then C++ is just as safe as all those other "safe" languages.
The standard vector does NOT do range checking by default.
i.e. The operator[] does not do a range check.
The method at() is similar to [] but does do a range check.
It will throw an exception on out of range.
std::vector::at()
std::vector::operator[]()
Other notes:
Why a vector<Pointers> ?
You can quite easily have a vector<Object>. Now there is no need to worry about memory management (i.e. leaks).
std::vector<std::vector<bool> > m;
Note: vector<bool> is overloaded and not very efficient (i.e. this structure was optimized for size not speed) (It is something that is now recognized as probably a mistake by the standards committee).
If you know the size of the matrix at compile time you could use std::bitset?
std::vector<std::bitset<5> > m;
or if it is runtime defined use boost::dynamic_bitset
std::vector<boost::dynamic_bitset> m;
All of the above will allow you to do:
m[6][3] = true;
If you want 'C' array performance, but with added safety and STL-like semantics (iterators, begin() & end() etc), use boost::array.
Basically it's a templated wrapper for 'C'-arrays with some NDEBUG-disable-able range checking asserts (and also some std::range_error exception-throwing accessors).
I use stuff like
boost::array<boost::array<float,4>,4> m;
instead of
float m[4][4];
all the time and it works great (with appropriate typedefs to keep the verbosity down, anyway).
UPDATE: Following some discussion in the comments here of the relative performance of boost::array vs boost::multi_array, I'd point out that this code, compiled with g++ -O3 -DNDEBUG on Debian/Lenny amd64 on a Q9450 with 1333MHz DDR3 RAM takes 3.3s for boost::multi_array vs 0.6s for boost::array.
#include <iostream>
#include <time.h>
#include "boost/array.hpp"
#include "boost/multi_array.hpp"
using namespace boost;
enum {N=1024};
typedef multi_array<char,3> M;
typedef array<array<array<char,N>,N>,N> C;
// Forward declare to avoid being optimised away
static void clear(M& m);
static void clear(C& c);
int main(int,char**)
{
const clock_t t0=clock();
{
M m(extents[N][N][N]);
clear(m);
}
const clock_t t1=clock();
{
std::auto_ptr<C> c(new C);
clear(*c);
}
const clock_t t2=clock();
std::cout
<< "multi_array: " << (t1-t0)/static_cast<float>(CLOCKS_PER_SEC) << "s\n"
<< "array : " << (t2-t1)/static_cast<float>(CLOCKS_PER_SEC) << "s\n";
return 0;
}
void clear(M& m)
{
for (M::index i=0;i<N;i++)
for (M::index j=0;j<N;j++)
for (M::index k=0;k<N;k++)
m[i][j][k]=1;
}
void clear(C& c)
{
for (int i=0;i<N;i++)
for (int j=0;j<N;j++)
for (int k=0;k<N;k++)
c[i][j][k]=1;
}
What I would do is create my own class for dealing with matrices (probably as an array[x*y] because I'm more used to C (and I'd have my own bounds checking), but you could use vectors or any other sub-structure in that class).
Get your stuff functional first then worry about how fast it runs. If you design the class properly, you can pull out your array[x*y] implementation and replace it with vectors or bitmasks or whatever you want without changing the rest of the code.
I'm not totally sure, but I thing that's what classes were meant for, the ability to abstract the implementation well out of sight and provide only the interface :-)
In addition to all the answers that have been posted so far, you might do well to check out the C++ FAQ Lite. Questions 13.10 - 13.12 and 16.16 - 16.19 cover several topics related to rolling your own matrix class. You'll see a couple of different ways to store the data and suggestions on how to best write the subscript operators.
Also, if your graph is sufficiently sparse, you may not need a matrix at all. You could use std::multimap to map each vertex to those it connects.
my favourite way to store a graph is vector<set<int>>; n elements in vector (nodes 0..n-1), >=0 elements in each set (edges). Just do not forget adding a reverse copy of every bi-directional edge.
Consider also how big is your graph/matrix, does performance matter a lot? Is the graph static, or can it grow over time, e.g. by adding new edges?
Probably, not relevant as this is an old question, but you can use the Armadillo library, which provides many linear algebra oriented data types and functions.
Below is an example for your specific problem:
// In C++11
Mat<bool> matrix = {
{ true, true},
{ false, false},
};
// In C++98
Mat<bool> matrix;
matrix << true << true << endr
<< false << false << endr;
Mind you std::vector doesn't do range checking either.
I'm writing an inner loop that needs to place structs in contiguous storage. I don't know how many of these structs there will be ahead of time. My problem is that STL's vector initializes its values to 0, so no matter what I do, I incur the cost of the initialization plus the cost of setting the struct's members to their values.
Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?
(I'm certain that this part of the code needs to be optimized, and I'm certain that the initialization is a significant cost.)
Also, see my comments below for a clarification about when the initialization occurs.
SOME CODE:
void GetsCalledALot(int* data1, int* data2, int count) {
int mvSize = memberVector.size()
memberVector.resize(mvSize + count); // causes 0-initialization
for (int i = 0; i < count; ++i) {
memberVector[mvSize + i].d1 = data1[i];
memberVector[mvSize + i].d2 = data2[i];
}
}
std::vector must initialize the values in the array somehow, which means some constructor (or copy-constructor) must be called. The behavior of vector (or any container class) is undefined if you were to access the uninitialized section of the array as if it were initialized.
The best way is to use reserve() and push_back(), so that the copy-constructor is used, avoiding default-construction.
Using your example code:
struct YourData {
int d1;
int d2;
YourData(int v1, int v2) : d1(v1), d2(v2) {}
};
std::vector<YourData> memberVector;
void GetsCalledALot(int* data1, int* data2, int count) {
int mvSize = memberVector.size();
// Does not initialize the extra elements
memberVector.reserve(mvSize + count);
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a temporary.
memberVector.push_back(YourData(data1[i], data2[i]));
}
}
The only problem with calling reserve() (or resize()) like this is that you may end up invoking the copy-constructor more often than you need to. If you can make a good prediction as to the final size of the array, it's better to reserve() the space once at the beginning. If you don't know the final size though, at least the number of copies will be minimal on average.
In the current version of C++, the inner loop is a bit inefficient as a temporary value is constructed on the stack, copy-constructed to the vectors memory, and finally the temporary is destroyed. However the next version of C++ has a feature called R-Value references (T&&) which will help.
The interface supplied by std::vector does not allow for another option, which is to use some factory-like class to construct values other than the default. Here is a rough example of what this pattern would look like implemented in C++:
template <typename T>
class my_vector_replacement {
// ...
template <typename F>
my_vector::push_back_using_factory(F factory) {
// ... check size of array, and resize if needed.
// Copy construct using placement new,
new(arrayData+end) T(factory())
end += sizeof(T);
}
char* arrayData;
size_t end; // Of initialized data in arrayData
};
// One of many possible implementations
struct MyFactory {
MyFactory(int* p1, int* p2) : d1(p1), d2(p2) {}
YourData operator()() const {
return YourData(*d1,*d2);
}
int* d1;
int* d2;
};
void GetsCalledALot(int* data1, int* data2, int count) {
// ... Still will need the same call to a reserve() type function.
// Note: consider using std::generate_n or std::copy instead of this loop.
for (int i = 0; i < count; ++i) {
// Copy construct using a factory
memberVector.push_back_using_factory(MyFactory(data1+i, data2+i));
}
}
Doing this does mean you have to create your own vector class. In this case it also complicates what should have been a simple example. But there may be times where using a factory function like this is better, for instance if the insert is conditional on some other value, and you would have to otherwise unconditionally construct some expensive temporary even if it wasn't actually needed.
In C++11 (and boost) you can use the array version of unique_ptr to allocate an uninitialized array. This isn't quite an stl container, but is still memory managed and C++-ish which will be good enough for many applications.
auto my_uninit_array = std::unique_ptr<mystruct[]>(new mystruct[count]);
C++0x adds a new member function template emplace_back to vector (which relies on variadic templates and perfect forwarding) that gets rid of any temporaries entirely:
memberVector.emplace_back(data1[i], data2[i]);
To clarify on reserve() responses: you need to use reserve() in conjunction with push_back(). This way, the default constructor is not called for each element, but rather the copy constructor. You still incur the penalty of setting up your struct on stack, and then copying it to the vector. On the other hand, it's possible that if you use
vect.push_back(MyStruct(fieldValue1, fieldValue2))
the compiler will construct the new instance directly in the memory thatbelongs to the vector. It depends on how smart the optimizer is. You need to check the generated code to find out.
You can use boost::noinit_adaptor to default initialize new elements (which is no initialization for built-in types):
std::vector<T, boost::noinit_adaptor<std::allocator<T>> memberVector;
As long as you don't pass an initializer into resize, it default initializes the new elements.
So here's the problem, resize is calling insert, which is doing a copy construction from a default constructed element for each of the newly added elements. To get this to 0 cost you need to write your own default constructor AND your own copy constructor as empty functions. Doing this to your copy constructor is a very bad idea because it will break std::vector's internal reallocation algorithms.
Summary: You're not going to be able to do this with std::vector.
You can use a wrapper type around your element type, with a default constructor that does nothing. E.g.:
template <typename T>
struct no_init
{
T value;
no_init() { static_assert(std::is_standard_layout<no_init<T>>::value && sizeof(T) == sizeof(no_init<T>), "T does not have standard layout"); }
no_init(T& v) { value = v; }
T& operator=(T& v) { value = v; return value; }
no_init(no_init<T>& n) { value = n.value; }
no_init(no_init<T>&& n) { value = std::move(n.value); }
T& operator=(no_init<T>& n) { value = n.value; return this; }
T& operator=(no_init<T>&& n) { value = std::move(n.value); return this; }
T* operator&() { return &value; } // So you can use &(vec[0]) etc.
};
To use:
std::vector<no_init<char>> vec;
vec.resize(2ul * 1024ul * 1024ul * 1024ul);
Err...
try the method:
std::vector<T>::reserve(x)
It will enable you to reserve enough memory for x items without initializing any (your vector is still empty). Thus, there won't be reallocation until to go over x.
The second point is that vector won't initialize the values to zero. Are you testing your code in debug ?
After verification on g++, the following code:
#include <iostream>
#include <vector>
struct MyStruct
{
int m_iValue00 ;
int m_iValue01 ;
} ;
int main()
{
MyStruct aaa, bbb, ccc ;
std::vector<MyStruct> aMyStruct ;
aMyStruct.push_back(aaa) ;
aMyStruct.push_back(bbb) ;
aMyStruct.push_back(ccc) ;
aMyStruct.resize(6) ; // [EDIT] double the size
for(std::vector<MyStruct>::size_type i = 0, iMax = aMyStruct.size(); i < iMax; ++i)
{
std::cout << "[" << i << "] : " << aMyStruct[i].m_iValue00 << ", " << aMyStruct[0].m_iValue01 << "\n" ;
}
return 0 ;
}
gives the following results:
[0] : 134515780, -16121856
[1] : 134554052, -16121856
[2] : 134544501, -16121856
[3] : 0, -16121856
[4] : 0, -16121856
[5] : 0, -16121856
The initialization you saw was probably an artifact.
[EDIT] After the comment on resize, I modified the code to add the resize line. The resize effectively calls the default constructor of the object inside the vector, but if the default constructor does nothing, then nothing is initialized... I still believe it was an artifact (I managed the first time to have the whole vector zerooed with the following code:
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
aMyStruct.push_back(MyStruct()) ;
So...
:-/
[EDIT 2] Like already offered by Arkadiy, the solution is to use an inline constructor taking the desired parameters. Something like
struct MyStruct
{
MyStruct(int p_d1, int p_d2) : d1(p_d1), d2(p_d2) {}
int d1, d2 ;
} ;
This will probably get inlined in your code.
But you should anyway study your code with a profiler to be sure this piece of code is the bottleneck of your application.
I tested a few of the approaches suggested here.
I allocated a huge set of data (200GB) in one container/pointer:
Compiler/OS:
g++ (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0
Settings: (c++-17, -O3 optimizations)
g++ --std=c++17 -O3
I timed the total program runtime with linux-time
1.) std::vector:
#include <vector>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
}
real 0m36.246s
user 0m4.549s
sys 0m31.604s
That is 36 seconds.
2.) std::vector with boost::noinit_adaptor
#include <vector>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
}
real 0m0.002s
user 0m0.001s
sys 0m0.000s
So this solves the problem. Just allocating without initializing costs basically nothing (at least for large arrays).
3.) std::unique_ptr<T[]>:
#include <memory>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
auto data = std::unique_ptr<size_t[]>(new size_t[size]);
}
real 0m0.002s
user 0m0.002s
sys 0m0.000s
So basically the same performance as 2.), but does not require boost.
I also tested simple new/delete and malloc/free with the same performance as 2.) and 3.).
So the default-construction can have a huge performance penalty if you deal with large data sets.
In practice you want to actually initialize the allocated data afterwards.
However, some of the performance penalty still remains, especially if the later initialization is performed in parallel.
E.g., I initialize a huge vector with a set of (pseudo)random numbers:
(now I use fopenmp for parallelization on a 24 core AMD Threadripper 3960X)
g++ --std=c++17-fopenmp -O3
1.) std::vector:
#include <vector>
#include <random>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m41.958s
user 4m37.495s
sys 0m31.348s
That is 42s, only 6s more than the default initialization.
The problem is, that the initialization of std::vector is sequential.
2.) std::vector with boost::noinit_adaptor:
#include <vector>
#include <random>
#include <boost/core/noinit_adaptor.hpp>
int main(){
constexpr size_t size = 1024lu*1024lu*1024lu*25lu;//25B elements = 200GB
std::vector<size_t,boost::noinit_adaptor<std::allocator<size_t>>> vec(size);
#pragma omp parallel
{
std::minstd_rand0 gen(42);
#pragma omp for schedule(static)
for (size_t i = 0; i < size; ++i) vec[i] = gen();
}
}
real 0m10.508s
user 1m37.665s
sys 3m14.951s
So even with the random-initialization, the code is 4 times faster because we can skip the sequential initialization of std::vector.
So if you deal with huge data sets and plan to initialize them afterwards in parallel, you should avoid using the default std::vector.
From your comments to other posters, it looks like you're left with malloc() and friends. Vector won't let you have unconstructed elements.
From your code, it looks like you have a vector of structs each of which comprises 2 ints. Could you instead use 2 vectors of ints? Then
copy(data1, data1 + count, back_inserter(v1));
copy(data2, data2 + count, back_inserter(v2));
Now you don't pay for copying a struct each time.
If you really insist on having the elements uninitialized and sacrifice some methods like front(), back(), push_back(), use boost vector from numeric . It allows you even not to preserve existing elements when calling resize()...
I'm not sure about all those answers that says it is impossible or tell us about undefined behavior.
Sometime, you need to use an std::vector. But sometime, you know the final size of it. And you also know that your elements will be constructed later.
Example : When you serialize the vector contents into a binary file, then read it back later.
Unreal Engine has its TArray::setNumUninitialized, why not std::vector ?
To answer the initial question
"Is there any way to prevent the initialization, or is there an STL-like container out there with resizeable contiguous storage and uninitialized elements?"
yes and no.
No, because STL doesn't expose a way to do so.
Yes because we're coding in C++, and C++ allows to do a lot of thing. If you're ready to be a bad guy (and if you really know what you are doing). You can hijack the vector.
Here a sample code that works only for the Windows's STL implementation, for another platform, look how std::vector is implemented to use its internal members :
// This macro is to be defined before including VectorHijacker.h. Then you will be able to reuse the VectorHijacker.h with different objects.
#define HIJACKED_TYPE SomeStruct
// VectorHijacker.h
#ifndef VECTOR_HIJACKER_STRUCT
#define VECTOR_HIJACKER_STRUCT
struct VectorHijacker
{
std::size_t _newSize;
};
#endif
template<>
template<>
inline decltype(auto) std::vector<HIJACKED_TYPE, std::allocator<HIJACKED_TYPE>>::emplace_back<const VectorHijacker &>(const VectorHijacker &hijacker)
{
// We're modifying directly the size of the vector without passing by the extra initialization. This is the part that relies on how the STL was implemented.
_Mypair._Myval2._Mylast = _Mypair._Myval2._Myfirst + hijacker._newSize;
}
inline void setNumUninitialized_hijack(std::vector<HIJACKED_TYPE> &hijackedVector, const VectorHijacker &hijacker)
{
hijackedVector.reserve(hijacker._newSize);
hijackedVector.emplace_back<const VectorHijacker &>(hijacker);
}
But beware, this is hijacking we're speaking about. This is really dirty code, and this is only to be used if you really know what you are doing. Besides, it is not portable and relies heavily on how the STL implementation was done.
I won't advise you to use it because everyone here (me included) is a good person. But I wanted to let you know that it is possible contrary to all previous answers that stated it wasn't.
Use the std::vector::reserve() method. It won't resize the vector, but it will allocate the space.
Do the structs themselves need to be in contiguous memory, or can you get away with having a vector of struct*?
Vectors make a copy of whatever you add to them, so using vectors of pointers rather than objects is one way to improve performance.
I don't think STL is your answer. You're going to need to roll your own sort of solution using realloc(). You'll have to store a pointer and either the size, or number of elements, and use that to find where to start adding elements after a realloc().
int *memberArray;
int arrayCount;
void GetsCalledALot(int* data1, int* data2, int count) {
memberArray = realloc(memberArray, sizeof(int) * (arrayCount + count);
for (int i = 0; i < count; ++i) {
memberArray[arrayCount + i].d1 = data1[i];
memberArray[arrayCount + i].d2 = data2[i];
}
arrayCount += count;
}
I would do something like:
void GetsCalledALot(int* data1, int* data2, int count)
{
const size_t mvSize = memberVector.size();
memberVector.reserve(mvSize + count);
for (int i = 0; i < count; ++i) {
memberVector.push_back(MyType(data1[i], data2[i]));
}
}
You need to define a ctor for the type that is stored in the memberVector, but that's a small cost as it will give you the best of both worlds; no unnecessary initialization is done and no reallocation will occur during the loop.