Here's an interesting question about the various quirks of the C++ language. I have a pair of functions, which are supposed to fill an array of points with the corners of a rectangle. There are two overloads for it: one takes a Point[5], the other takes a Point[4]. The 5-point version refers to a closed polygon, whereas the 4-point version is when you just want the 4 corners, period.
Obviously there's some duplication of work here, so I'd like to be able to use the 4-point version to populate the first 4 points of the 5-point version, so I'm not duplicating that code. (Not that it's much to duplicate, but I have terrible allergic reactions whenever I copy and paste code, and I'd like to avoid that.)
The thing is, C++ doesn't seem to care for the idea of converting a T[m] to a T[n] where n < m. static_cast seems to think the types are incompatible for some reason. reinterpret_cast handles it fine, of course, but is a dangerous animal that, as a general rule, is better to avoid if at all possible.
So my question is: is there a type-safe way of casting an array of one size to an array of a smaller size where the array type is the same?
[Edit] Code, yes. I should have mentioned that the parameter is actually a reference to an array, not simply a pointer, so the compiler is aware of the type difference.
void RectToPointArray(const degRect& rect, degPoint(&points)[4])
{
points[0].lat = rect.nw.lat; points[0].lon = rect.nw.lon;
points[1].lat = rect.nw.lat; points[1].lon = rect.se.lon;
points[2].lat = rect.se.lat; points[2].lon = rect.se.lon;
points[3].lat = rect.se.lat; points[3].lon = rect.nw.lon;
}
void RectToPointArray(const degRect& rect, degPoint(&points)[5])
{
// I would like to use a more type-safe check here if possible:
RectToPointArray(rect, reinterpret_cast<degPoint(&)[4]> (points));
points[4].lat = rect.nw.lat; points[4].lon = rect.nw.lon;
}
[Edit2] The point of passing an array-by-reference is so that we can be at least vaguely sure that the caller is passing in a correct "out parameter".
I don't think it's a good idea to do this by overloading. The name of the function doesn't tell the caller whether it's going to fill an open array or not. And what if the caller has only a pointer and wants to fill coordinates (let's say he wants to fill multiple rectangles to be part of a bigger array at different offsets)?
I would do this by two functions, and let them take pointers. The size isn't part of the pointer's type
void fillOpenRect(degRect const& rect, degPoint *p) {
...
}
void fillClosedRect(degRect const& rect, degPoint *p) {
fillOpenRect(rect, p); p[4] = p[0];
}
I don't see what's wrong with this. Your reinterpret_cast should work fine in practice (I don't see what could go wrong: both alignment and representation will be correct, so the merely formal undefined behavior shouldn't cause problems in reality here, I think), but as I said above, I think there's no good reason to make these functions take the arrays by reference.
If you want to do it generically, you can write it by output iterators
template<typename OutputIterator>
OutputIterator fillOpenRect(degRect const& rect, OutputIterator out) {
    // iterator_traits<OutputIterator>::value_type is void for output iterators
    // such as std::back_insert_iterator, so the point type is named directly.
    degPoint const pt[] = {
        { rect.nw.lat, rect.nw.lon },
        { rect.nw.lat, rect.se.lon },
        { rect.se.lat, rect.se.lon },
        { rect.se.lat, rect.nw.lon }
    };
    for(int i = 0; i < 4; i++)
        *out++ = pt[i];
    return out;
}
template<typename OutputIterator>
OutputIterator fillClosedRect(degRect const& rect, OutputIterator out) {
    out = fillOpenRect(rect, out);
    // Repeat the first corner to close the polygon.
    degPoint const p1 = { rect.nw.lat, rect.nw.lon };
    *out++ = p1;
    return out;
}
You can then use it with vectors and also with arrays, whatever you prefer most.
std::vector<degPoint> points;
fillClosedRect(someRect, std::back_inserter(points));
degPoint points[5];
fillClosedRect(someRect, points);
If you want to write safer code, you can use the vector way with back-inserters, and if you work with lower level code, you can use a pointer as output iterator.
I would use std::vector. In some extreme cases you could even pass plain arrays via a pointer such as Point* (this is really bad and should generally be avoided); either way, you wouldn't have such "casting" troubles.
Why don't you just pass a standard pointer, instead of a sized one, like this
void RectToPointArray(const degRect& rect, degPoint * points ) ;
I don't think your framing/thinking of the problem is correct. You don't generally need to concretely type an object that has 4 vertices vs an object that has 5.
But if you MUST type it, then you can use structs to concretely define the types instead.
struct Coord
{
float lat, lon ; // "long" is a keyword and cannot be a member name
} ;
Then
struct Rectangle
{
Coord points[ 4 ] ;
} ;
struct Pentagon
{
Coord points[ 5 ] ;
} ;
Then,
// 4 pt version (the output parameter must be non-const so it can be filled)
void RectToPointArray(const degRect& rect, Rectangle& rectangle ) ;
// 5 pt version
void RectToPointArray(const degRect& rect, Pentagon& pent ) ;
I think this solution is a bit extreme, however. A std::vector<Coord> whose size you check with asserts (expecting either 4 or 5) would do just fine.
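The vector-with-asserts alternative might look roughly like this. It is only a sketch: the degRect stand-in, the member order of Coord and the function name fillRect are assumptions made for illustration, not part of the original code.

```cpp
#include <cassert>
#include <vector>

struct Coord { float lat, lon; };   // assumed layout
struct degRect { Coord nw, se; };   // hypothetical stand-in for the real degRect

// Fills 4 corners, or 5 points (a closed polygon), depending on points.size().
void fillRect(const degRect& rect, std::vector<Coord>& points)
{
    assert(points.size() == 4 || points.size() == 5);
    points[0] = Coord{ rect.nw.lat, rect.nw.lon };
    points[1] = Coord{ rect.nw.lat, rect.se.lon };
    points[2] = Coord{ rect.se.lat, rect.se.lon };
    points[3] = Coord{ rect.se.lat, rect.nw.lon };
    if (points.size() == 5)
        points[4] = points[0];      // repeat the first corner to close the polygon
}
```

The caller decides between the open and closed form simply by sizing the vector to 4 or 5 before the call.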
I guess you could use function template specialization, like this (simplified example where first argument was ignored and function name was replaced by f(), etc.):
#include <iostream>
using namespace std;
class X
{
};
template<int sz, int n>
int f(X (&x)[sz])
{
cout<<"process "<<n<<" entries in a "<<sz<<"-dimensional array"<<endl;
int partial_result=f<sz,n-1>(x);
cout<<"process last entry..."<<endl;
return n;
}
//template specialization for sz=5 and n=4 (number of entries to process)
template<>
int f<5,4>(X (&x)[5])
{
cout<<"process only the first "<<4<<" entries here..."<<endl;
return 4;
}
int main(void)
{
X u[5];
int res=f<5,5>(u);
return 0;
}
Of course, you would have to take care of other (potentially dangerous) special cases like n = {0,1,2,3}, and you're probably better off using unsigned ints instead of ints.
"So my question is: is there a type-safe way of casting an array of one size to an array of a smaller size where the array type is the same?"
No. I don't think the language allows you to do this at all: consider casting int[10] to int[5]. You can always get a pointer to the array, however, but we can't 'trick' the compiler into thinking a fixed-size array has a different number of elements.
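One way to use the "you can always get a pointer" observation without any cast is to keep both array-reference overloads and let them share a pointer-based helper. This is a sketch under assumptions: the degPoint and degRect definitions and the helper name detail::fillCorners are invented for illustration.

```cpp
#include <cassert>

struct degPoint { double lat, lon; };   // assumed layout
struct degRect { degPoint nw, se; };    // assumed layout

namespace detail {
// Hypothetical shared helper: writes the four corners through a plain pointer.
inline void fillCorners(const degRect& rect, degPoint* p)
{
    p[0].lat = rect.nw.lat; p[0].lon = rect.nw.lon;
    p[1].lat = rect.nw.lat; p[1].lon = rect.se.lon;
    p[2].lat = rect.se.lat; p[2].lon = rect.se.lon;
    p[3].lat = rect.se.lat; p[3].lon = rect.nw.lon;
}
}

void RectToPointArray(const degRect& rect, degPoint (&points)[4])
{
    detail::fillCorners(rect, points);  // the array decays to degPoint*
}

void RectToPointArray(const degRect& rect, degPoint (&points)[5])
{
    detail::fillCorners(rect, points);  // fills the first four elements
    points[4] = points[0];              // close the polygon
}
```

Callers still get the compile-time size check from the reference-to-array parameters; only the internal helper gives up the size information.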
If you're not going to use std::vector or some other container that can properly identify the number of points inside at runtime (which would let you do all this conveniently in one function instead of two overloads selected by the number of elements), then rather than trying to do crazy casts, consider this at least as an improvement:
void RectToPointArray(const degRect& rect, degPoint* points, unsigned int size);
If you're set on working with arrays, you can still define a generic function like this:
template <class T, size_t N>
std::size_t array_size(const T(&/*array*/)[N])
{
return N;
}
... and use that when calling RectToPointArray to pass the argument for 'size'. Then you have a size you can determine at runtime and it's easy enough to work with size - 1, or more appropriate for this case, just put a simple if statement to check if there are 5 elements or 4.
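Put together, that could look something like the following. It is a sketch only: the degPoint and degRect definitions are assumed, and the size parameter is a std::size_t here rather than the unsigned int shown above, purely to match what array_size returns.

```cpp
#include <cassert>
#include <cstddef>

struct degPoint { double lat, lon; };   // assumed layout
struct degRect { degPoint nw, se; };    // assumed layout

template <class T, std::size_t N>
std::size_t array_size(const T (&/*array*/)[N])
{
    return N;
}

// One function for both cases; the size is checked at runtime.
void RectToPointArray(const degRect& rect, degPoint* points, std::size_t size)
{
    assert(size == 4 || size == 5);
    points[0].lat = rect.nw.lat; points[0].lon = rect.nw.lon;
    points[1].lat = rect.nw.lat; points[1].lon = rect.se.lon;
    points[2].lat = rect.se.lat; points[2].lon = rect.se.lon;
    points[3].lat = rect.se.lat; points[3].lon = rect.nw.lon;
    if (size == 5)
        points[4] = points[0];          // close the polygon
}
```

A call site would then read RectToPointArray(rect, pts, array_size(pts)) for a degPoint pts[5].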
Later if you change your mind and use std::vector, Boost.Array, etc. you can still use this same old function without modifying it. It only requires that the data is contiguous and mutable. You can get fancy with this and apply very generic solutions that, say, only require forward iterators. Yet I don't think this problem is complicated enough to warrant such a solution: it'd be like using a cannon to kill a fly; fly swatter is okay.
If you're really set on the solution you have, then it's easy enough to do this:
template <size_t N>
void RectToPointArray(const degRect& rect, degPoint(&points)[N])
{
assert(N >= 4 && "points requires at least 4 elements!");
points[0].lat = rect.nw.lat; points[0].lon = rect.nw.lon;
points[1].lat = rect.nw.lat; points[1].lon = rect.se.lon;
points[2].lat = rect.se.lat; points[2].lon = rect.se.lon;
points[3].lat = rect.se.lat; points[3].lon = rect.nw.lon;
if (N >= 5)
{
// Braces are required: without them only the first assignment is conditional.
points[4].lat = rect.nw.lat; points[4].lon = rect.nw.lon;
}
}
Yeah, there is one unnecessary runtime check but trying to do it at compile time is probably analogous to taking things out of your glove compartment in an attempt to increase your car's fuel efficiency. With N being a compile-time constant expression, the compiler is likely to recognize that the condition is always false when N < 5 and just eliminate that whole section of code.
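If you do want the check resolved at compile time anyway, a C++17 compiler lets you write it with static_assert and if constexpr. This is a sketch that assumes C++17 is available and uses made-up degPoint and degRect definitions.

```cpp
#include <cassert>
#include <cstddef>

struct degPoint { double lat, lon; };   // assumed layout
struct degRect { degPoint nw, se; };    // assumed layout

template <std::size_t N>
void RectToPointArray(const degRect& rect, degPoint (&points)[N])
{
    static_assert(N >= 4, "points requires at least 4 elements!");
    points[0].lat = rect.nw.lat; points[0].lon = rect.nw.lon;
    points[1].lat = rect.nw.lat; points[1].lon = rect.se.lon;
    points[2].lat = rect.se.lat; points[2].lon = rect.se.lon;
    points[3].lat = rect.se.lat; points[3].lon = rect.nw.lon;
    if constexpr (N >= 5)               // discarded entirely when N is 4
    {
        points[4] = points[0];          // close the polygon
    }
}
```

Passing an array with fewer than 4 elements then fails to compile instead of tripping an assert at runtime.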
There are many design issues I have found with this, particularly with passing std::array<> to functions. Basically, when you declare a std::array, it takes two template parameters, <class T, size_t size>. However, when you create a function that takes a std::array, we do not know the size, so we need to add template parameters to the function as well.
template <size_t params_size> auto func(std::array<int, params_size> arr);
Why couldn't std::array take in the size at the constructor instead? (i.e.):
auto array = std::array<int>(10);
Then the functions would look less aggressive and would not require template params, as such:
auto func (std::array<int> arr);
I just want to know the design choice for std::array, and why it was designed this way.
This isn't a question due to a bug, but rather a question why std::array<> was designed in such a manner.
std::array<T,N> var is intended as a better replacement for C-style arrays T var[N].
The memory for this object is created locally, i.e. on the stack for local variables or inside the struct itself when defined as a member.
std::vector<T>, on the contrary, always allocates its elements' memory on the heap (with the default allocator).
Therefore, as std::array is allocated locally, it cannot have a variable size, since that space needs to be reserved at compile time. std::vector, on the other hand, has the ability to reallocate and resize since its memory is allocated dynamically.
As a consequence, the big advantage of std::array in terms of performance is that it eliminates that one level of indirection that std::vector pays for its flexibility.
For example:
#include <cstdint>
#include <cstdio> // printf
#include <iostream>
#include <vector>
#include <array>
int main() {
int a;
char b[10];
std::vector<char> c(10);
std::array<char,10> d;
struct E {
std::array<char,10> e1;
std::vector<char> e2{10};
};
E e;
printf( "Stack address: %p\n", __builtin_frame_address(0));
printf( "Address of a: %p\n", &a );
printf( "Address of b: %p\n", b );
printf( "Address of b[0]: %p\n", &b[0] );
printf( "Address of c: %p\n", &c );
printf( "Address of c[0]: %p\n", &c[0] );
printf( "Address of d: %p\n", &d );
printf( "Address of d[0]: %p\n", &d[0] );
printf( "Address of e: %p\n", &e );
printf( "Address of e1: %p\n", &e.e1 );
printf( "Address of e1[0]:%p\n", &e.e1[0] );
printf( "Address of e2: %p\n", &e.e2);
printf( "Address of e2[0]:%p\n", &e.e2[0] );
}
Produces
Program stdout
Stack address: 0x7fffeb115ed0
Address of a: 0x7fffeb115eb0
Address of b: 0x7fffeb115ea6
Address of b[0]: 0x7fffeb115ea6
Address of c: 0x7fffeb115e80
Address of c[0]: 0x1cad2b0
Address of d: 0x7fffeb115e76
Address of d[0]: 0x7fffeb115e76
Address of e: 0x7fffeb115e40
Address of e1: 0x7fffeb115e40
Address of e1[0]:0x7fffeb115e40
Address of e2: 0x7fffeb115e50
Address of e2[0]:0x1cad2d0
Godbolt: https://godbolt.org/z/75s47T56f
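The same point can be checked without reading addresses by eye. Here is a small sketch (the helper name elementsLiveInsideObject is made up) that tests whether a container's first element lies within the bytes of the container object itself:

```cpp
#include <array>
#include <cassert>
#include <functional>
#include <vector>

// Returns true if the first element of c is stored inside the object c itself.
// std::less / std::less_equal are used because comparing unrelated pointers
// with the built-in operators is unspecified.
template <class Container>
bool elementsLiveInsideObject(const Container& c)
{
    const char* objBegin = reinterpret_cast<const char*>(&c);
    const char* objEnd = objBegin + sizeof(c);
    const char* elem = reinterpret_cast<const char*>(&c[0]);
    return std::less_equal<const char*>()(objBegin, elem)
        && std::less<const char*>()(elem, objEnd);
}
```

For a std::array<char,10> this should report true, and for a non-empty std::vector<char> false, which matches the address listing above.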
Not an answer, really, because I used to despise std::array<> for the same reasons as you: anything with Monadic qualities is not good design (IMNSHO).
Fortunately, C++20 has the solution: a dynamic std::span<>.
#include <array>
#include <iostream>
#include <span>
namespace detail
{
void print( const std::span<const int> & xs )
{
for (size_t n = 0; n < xs.size(); n++)
std::cout << xs[n] << " ";
}
}
void print( const std::span<const int> & xs )
{
std::cout << "{ ";
detail::print( xs );
std::cout << "}\n";
}
void add( const std::span<int> & xs, int n )
{
for (int & x : xs)
x += n;
}
int main()
{
std::array<int,5> xs { 1, 2, 4, 6, 10 };
add( xs, 1 );
print( xs );
}
Notice that the span itself is const in all cases, but the elements themselves are modifiable unless they too are tagged const. This is exactly what an array is like.
std::span is a C++20 object. I know that MS and maybe others had an array_view in older versions of their libraries.
tl;dr
Use std::array only to declare your array object. Pass it around with a dynamic std::span.
std::array vs C array
The use-case for std::array is actually very narrow: encapsulate a fixed-size array as a first-class container object (one that can be copied, not just referenced).
At first blush this doesn’t seem to be much of an improvement over standard C-style arrays:
typedef int myarray[10]; // (1)
using myarray = std::array<int,10>; // (2)
void f( myarray a );
But it is! The difference is in what f() actually gets:
For a C-style array, the argument is just a pointer — a reference to the caller’s data (that you can modify!). You know the size of the referenced array (10), but writing code to get that size is not straight-forward even with the usual C array-size idiom (sizeof(myarray)/sizeof(a[0]), since sizeof(a) is the size of a pointer).
For the std::array, the argument value is an actual local copy of the caller’s data. If you want to be able to modify the caller’s data, you need to explicitly declare the formal argument as a reference type (myarray & a); if you just want to avoid an expensive copy, declare it as a const reference (const myarray & a). This falls in line with how other C++ objects are passed. And though the size is still 10, your code can query the size of the array with the usual C++ container idiom: a.size()!
The usual way C overcomes this is to clutter the call site and formal argument lists with information about the array size so that it doesn’t get lost.
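#include <stdio.h>
#define ARRAY_SIZE(a) (sizeof(a) / sizeof((a)[0]))
void f( int array[], size_t n ) // traditional C
{
    printf( "There are %zu elements.\n", n );
    /* a recursive call must pass the size along again: f( array, n ); */
}
int main(void)
{
    int my_array[10] = {0};
    f( my_array, ARRAY_SIZE(my_array) );
}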
The std::array way is cleaner.
#include <array>
#include <iostream>
void f( std::array<int,10> & array ) // C++
{
    std::cout << "There are " << array.size() << " elements.\n";
    // a recursive call needs no extra size argument: f( array );
}
int main()
{
    std::array<int,10> my_array{};
    f( my_array );
}
But while cleaner, it is significantly less flexible than the C array, simply because its length is fixed. A caller cannot pass a std::array<int,12> to the function, for example.
I’ll refer you to the other good answers here to consider more about container choice when handling arrayed data.
If you have a problem with std::array and you think std::span is a solution, now you will have two problems.
More seriously, without knowing what kind of conceptual operation is func it is difficult to tell what is the right alternative.
First, if you want to (or can) exploit knowing the size at compile time, there is nothing cooler than what you are trying to avoid.
template<std::size_t N>
void func(std::array<int, N> arr); // add & or && or const& if appropriate
Imagine it, knowing the size at compile time can allow you and the compiler to do all sorts of tricks, like unrolling loops completely or verifying logic at compile time (e.g. if you know the size must be smaller or bigger than a constant).
Or the coolest trick of all, not needing to allocate memory for any auxiliary operation inside func (because you know the size of the problem a priori).
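As an illustration of that last point, here is a sketch of a function that needs a scratch copy of its input and gets it without any heap allocation, because N is known at compile time. The function name median and the choice of std::nth_element are just for the example.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

// Returns the (upper) median of arr. The scratch buffer is a std::array,
// so it has automatic storage and no dynamic allocation takes place.
template <std::size_t N>
int median(const std::array<int, N>& arr)
{
    static_assert(N > 0, "median of an empty array is undefined");
    std::array<int, N> scratch = arr;   // copy, so the caller's data is untouched
    std::nth_element(scratch.begin(), scratch.begin() + N / 2, scratch.end());
    return scratch[N / 2];
}
```

With a std::vector parameter the same function would have to allocate the scratch copy at runtime.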
If you want a dynamic array, use (and pass) a std::vector.
void func(std::vector<int> dynarr); // add & or && or const& if appropriate
But then you force your caller to use std::vector as the container.
If you want it to accept a fixed array and work with everything else too,
template<class FixedArray>
void func(FixedArray dynarr); // add & or && or const& if appropriate
Ask yourself, how specific is your function such that you really really want to make it work with any size of std::array but not with std::vector?
Why specifically ints even?
template<class ArithmeticRange>
void func(ArithmeticRange dynarr); // add & or && or const& if appropriate
There are a few contiguous containers and ranges in C++ std. They serve different purposes. There are also a few techniques for passing them around.
I'll try to be exhaustive.
std::array<int, 7>
this is a buffer of 7 ints. They are stored within the object itself. Putting an array somewhere is putting enough storage for exactly 7 ints in that location (plus possible padding for alignment reasons, but that is at the end of the buffer).
You use this when, at compile time, you know exactly how big something is, or need to know.
std::vector<int>
this object holds ownership of a buffer of ints. The memory that holds those ints is dynamically allocated and can change at runtime. The object itself is usually 3 pointers in size. It has a growth strategy that avoids doing N^2 work when you keep adding 1 element at a time to it.
This object can be efficiently moved -- it will steal the buffer if the old object is marked (by std::move or other ways) as being safe to steal state from.
std::span<int>
This represents an externally owned sequence of ints, possibly stored in a std::array or owned by a std::vector, or stored somewhere else. It knows where in memory it starts and when it ends.
Unlike the two above, it is not a container, but a range or a view of the contents. So assigning one span to another only rebinds the view and does not copy any elements (which can be confusing), and you are responsible for ensuring that the source buffer lasts "long enough" that you don't use the span after the buffer is gone.
span is often used as a function argument. In your case, it probably solves most of your problem -- it lets you pass arrays of different sizes to a function, and within that function you can read or write the values.
span follows pointer semantics. That means const std::span<int> is like an int*const -- the pointer is const, but the thing pointed to is not! You are free to modify the elements in const std::span<int>. In comparison, std::span<const int> is like an int const* -- the pointer is not const, but the thing pointed to is. You are free to change what range of elements the span refers to in std::span<const int>, but you aren't allowed to modify the elements themselves.
A final technique is auto or templates. Here we implement the body of the function in the header (or equivalent) and leave the type unconstrained (or, constrained by concepts).
template<std::size_t N>
int total0( std::array<int, N> const& elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
int total1( std::vector<int> const& elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
int total2( std::span<int const> elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
int total3( auto const& elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
template<class Ints>
int total4( Ints const& elems ) {
int r = 0;
for (int e:elems) r+=e;
return r;
}
Notice that these all have the same implementation.
total3 and total4 are identical; you need a more modern compiler to use total3 syntax.
total1 and total2 allow you to split the implementation into a cpp file away from the header file. Also, code generation isn't done for different arguments.
total0, total3 and total4 all result in different code to be generated based on the type of the arguments. This can cause binary bloat issues, especially if the body was more complex than shown, and causes build time problems in larger projects.
total1 won't work with a std::array directly. You can do total1({arr.begin(), arr.end()}) which would copy the contents to a dynamic vector before using the code.
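Spelled out, that workaround might look like this; the wrapper name totalOfArray and the fixed size 4 are only there to make the sketch self-contained:

```cpp
#include <array>
#include <cassert>
#include <vector>

int total1(std::vector<int> const& elems) {
    int r = 0;
    for (int e : elems) r += e;
    return r;
}

int totalOfArray(std::array<int, 4> const& arr)
{
    // Builds a temporary std::vector<int> from the iterator pair
    // (one allocation plus a copy), then sums the copy.
    return total1({arr.begin(), arr.end()});
}
```

The temporary vector is destroyed as soon as the call returns, so the copy is paid on every call.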
Finally, note that span<int> is the closest you get to the C way of arr[], size. Span is, in essence, a pointer-to-first and length pair, with utility code wrapping it.
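To make the "pointer-to-first and length pair" description concrete, here is a deliberately tiny hand-rolled view. The name int_view is hypothetical and this is in no way how std::span is actually implemented; it only shows the idea.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for std::span<int>: a pointer to the first element
// plus a length, with just enough utility code for a range-based for loop.
struct int_view {
    int* first;
    std::size_t length;

    template <std::size_t N>
    int_view(int (&arr)[N]) : first(arr), length(N) {}        // from a C array
    int_view(int* p, std::size_t n) : first(p), length(n) {}  // from pointer + size

    int* begin() const { return first; }
    int* end() const { return first + length; }
    std::size_t size() const { return length; }
};

int total(int_view elems) {
    int r = 0;
    for (int e : elems) r += e;
    return r;
}
```

The function body never sees the size as a separate argument, yet no template is needed either.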
The main purpose of a C++11 std::array<> is to be a decent replacement for C-style arrays [], especially when they're declared with new and dismissed with delete[].
The main goal here is to get an official, managed object that serves as an array, while maintaining as constant expressions everything that can be.
A principal issue with regular arrays is that, since they're not actually class objects, one cannot derive a class from them (forcing you to implement iterators yourself), and they are a pain when you copy classes that use them as members.
Since new and new[] return pointers, each time you need either to implement a copy constructor that allocates another array and then copies the content, or to maintain your own reference counter for it.
From this perspective, std::array<> is a good way to declare purely static arrays that will be managed by the language itself.
Suppose a n-dimensional array that is passed as template argument and should be traversed in order to save it to a file. First of all I want to find out the size of the elements the array consists of. Thereto I try to dereference the pointers until I get the first element at [0][0][0]...[0]. But I already fail at this stage:
/**
* @brief save a n-dimensional array to file
*
* @param arr: the n-level-pointer to the data to be saved
* @param dimensions: pointer to array where dimensions of <arr> are stored
* @param n: number of levels / dimensions of <arr>
*/
template <typename T>
void save_array(T arr, unsigned int* dimensions, unsigned int n){
// how to put this in a loop ??
auto deref1 = *arr;
auto deref2 = *deref1;
auto deref3 = *deref2;
// do this n times, then derefn is equivalent to arr[0]...[0], 42 should be printed
std::cout << derefn << std::endl;
/* further code */
}
/*
* test call
*/
int main(){
unsigned int dim[4] = {50, 60, 80, 50};
uint8_t**** arr = new uint8_t***[50];
/* further initialization of arr, omitted here */
arr[0][0][0][0] = 42;
save_array(arr, dim, 4);
}
When I think of this from a memory perspective I want to perform a n-indirect load of a given address.
I saw a related question that was asked yesterday:
Declaring dynamic Multi-Dimensional pointer
This would help me a lot as well. One comment states it is not possible since the types of all expressions must be known at compile time. In my case everything is actually known: all callers of save_array will have n hardcoded before passing it. So I think it could just be a matter of defining things in the right place, which I am not yet able to do.
I know I am writing C-style code in C++ and there could be options to achieve this with classes etc., but my question is: Is it possible to achieve n-level pointer dereference by an iterative or recursive approach? Thanks!
First of all: Do you really need a jagged array? Do you want to have some sort of sparse array? Because otherwise, could you not just flatten your n-dimensional structure into a single, long array? That would not just lead to much simpler code, but most likely also be more efficient.
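For the flattened layout, the only extra piece you need is the mapping from n indices to one offset. A sketch, assuming row-major order and a made-up helper name flat_index:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major offset of element idx[0], idx[1], ... in an array whose extents
// are dims[0], dims[1], ... (both vectors must have the same length).
std::size_t flat_index(const std::vector<std::size_t>& dims,
                       const std::vector<std::size_t>& idx)
{
    assert(dims.size() == idx.size());
    std::size_t offset = 0;
    for (std::size_t d = 0; d < dims.size(); ++d) {
        assert(idx[d] < dims[d]);
        offset = offset * dims[d] + idx[d];   // Horner-style accumulation
    }
    return offset;
}
```

With the data held in one std::vector<std::uint8_t> of 50*60*80*50 elements, saving it to a file becomes a single write of a contiguous buffer.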
That being said: It can be done for sure. For example, just use a recursive template and rely on overloading to peel off levels of indirection until you get to the bottom:
template <typename T>
void save_array(T* arr, unsigned int* dimensions)
{
for (unsigned int i = 0U; i < *dimensions; ++i)
std::cout << ' ' << *arr++;
std::cout << std::endl;
}
template <typename T>
void save_array(T** arr, unsigned int* dimensions)
{
for (unsigned int i = 0U; i < *dimensions; ++i)
save_array(arr[i], dimensions + 1); // visit every sub-array, not just the first one
}
You don't even need to explicitly specify the number of indirections n, since that number is implicitly given by the pointer type.
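If you ever do want n as a number, it can be recovered from the pointer type at compile time with the same kind of recursion. A sketch; the trait name pointer_depth is made up, and cv-qualified pointers such as T* const are not handled:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Number of pointer levels in T: 0 for non-pointers,
// otherwise 1 + the depth of the pointee type.
template <typename T>
struct pointer_depth {
    static constexpr std::size_t value = 0;
};

template <typename T>
struct pointer_depth<T*> {
    static constexpr std::size_t value = 1 + pointer_depth<T>::value;
};

static_assert(pointer_depth<std::uint8_t****>::value == 4,
              "four levels of indirection");
```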
You can do basically the same trick to allocate/deallocate the array too:
template <typename T>
struct array_builder;
template <typename T>
struct array_builder<T*>
{
T* allocate(unsigned int* dimensions) const
{
return new T[*dimensions];
}
};
template <typename T>
struct array_builder<T**> : private array_builder<T*>
{
T** allocate(unsigned int* dimensions) const
{
T** array = new T*[*dimensions];
for (unsigned int i = 0U; i < *dimensions; ++i)
array[i] = array_builder<T*>::allocate(dimensions + 1);
return array;
}
};
In this direction, however, you need partial specialization, since the approach using overloading only works when the type can be deduced from a parameter. Since functions cannot be partially specialized, you have to wrap it in a class template like that. Usage:
unsigned int dim[4] = { 50, 60, 80, 50 };
auto arr = array_builder<std::uint8_t****>{}.allocate(dim);
arr[0][0][0][0] = 42;
save_array(arr, dim);
Hope I didn't overlook anything; having this many indirections out in the open can get massively confusing real quick, which is why I strongly advise against ever doing this in real code unless absolutely unavoidable. Also this raw usage of new all over the place is anything but great. Ideally, you'd be using, e.g., std::unique_ptr. Or, better yet, just nested std::vectors as suggested in the comments…
Why not just use a data structure like tree with multiple child nodes.
Suppose you need to store n dimensional array values, create a node pointing to the first dimension. Say your first dimension length is 5 then you have 5 child nodes and if your 2nd dimension size is 10. Then for each of these 5 node you have 10 child nodes and so on....
Something like:
struct node{
int index;
int dimension;
vector<node*> children;
};
It will be easier to traverse through tree and is much cleaner.
I got this library of mathematical routines ( without documentation ) to work on some task at college. The problem I have with it is that all of its functions have void return type, although these functions call one another, or are part of another, and the results of their computations are needed.
This is a piece of ( simplified ) code extracted from the libraries. Don't bother about the mathematics in code, it is not significant. Just passing arguments and returning results is what puzzles me ( as described after code ) :
// first function
void vector_math // get the (output) vector we need
(
double inputV[3], // input vector
double outputV[3] // output vector
)
{
// some variable declarations and simple arithmetics
// .....
//
transposeM(matrix1, matrix2, 3, 3 ); // matrix2 is the result
matrixXvector( matrix2, inputV, outputV); // here you get the result, outputV
}
////////
// second function
void transposeM // transposes a matrix
(
std::vector< std::vector<double> > mat1, // input matrix
std::vector< std::vector<double> > &mat2, // transposed matrix
int mat1rows, int mat1columns
)
{
int row,col;
mat2.resize(mat1columns); // rows
for (std::vector< std::vector<double> >::iterator it=mat2.begin(); it !=mat2.end();++it)
it->resize(mat1rows);
for (row = 0; row < mat1rows; row++)
{
for (col = 0; col < mat1columns; col++)
mat2[col][row] = mat1[row][col];
}
}
////////
// third function
void matrixXvector // multiply matrix and vector
(
std::vector< std::vector<double> > inMatrix,
double inVect[3],
double outVect[3]
)
{
int row,col,ktr;
for (row = 0; row <= 2; row++)
{
outVect[row]= 0.0;
for (ktr = 0; ktr <= 2; ktr++)
outVect[row]= outVect[row] + inMatrix[row][ktr] * inVect[ktr];
}
}
So "vector_math" is being called by the main program. It takes inputV as input and the result should be outputV. However, outputV is one of the input arguments, and the function returns void. And similar process occurs later when calling "transposeM" and "matrixXvector".
Why is the output variable one of the input arguments ? How are the results being returned and used for further computation ? How this kind of passing and returning arguments works ?
Since I am a beginner and also have never seen this style of coding, I don't understand how passing parameters and especially giving output works in these functions. Therefore I don't know how to use them and what to expect of them ( what they will actually do ). So I would very much appreciate an explanation that will make these processes clear to me.
EXTRA :
Thank you all for great answers. It was first time I could barely decide which answer to accept, and even as I did it felt unfair to others. I would like to add an extra question though, if anyone is willing to answer ( as a comment is enough ). Does this "old" style of coding input/output arguments have its name or any other expression with which it is referred ?
This is an "old" (but still popular) style of returning certain or multiple values. It works like this:
void copy (const std::vector<double>& input, std::vector<double>& output) {
output = input;
}
int main () {
std::vector<double> old_vector {1,2,3,4,5}, new_vector;
copy (old_vector, new_vector); // new_vector now copy of old_vector
}
So basically you give the function one or multiple output parameter to write the result of its computation to.
Whether you pass input parameters (i.e. those you don't intend to change) by value or by const reference does not matter for correctness, although passing read-only arguments by value might be costly performance-wise. In the first case, you copy the input object and use the copy in the function; in the latter, you just let the function see the original and prevent it from being modified with the const. The const for the input parameters is optional, but leaving it out allows the function to change their values, which might not be what you want, and inhibits passing temporaries as input.
The output parameter(s) have to be passed by non-const reference to allow the function to change it/them.
Another, even older and more "C-ish" style is to pass output pointers or raw arrays, like the first of your functions does. This is potentially dangerous, as the pointer might not point to a valid piece of memory, but it is still pretty widespread. It works essentially just like the first example:
// Copies in to int pointed to by out
void copy (int in, int* out) {
*out = in;
}
// Copies int pointed to by in to int pointed to by out
void copy (const int* in, int* out) {
*out = *in;
}
// Copies length ints beginning from in to length ints beginning at out
void copy (const int* in, int* out, std::size_t length) {
// For loop for beginner, use std::copy IRL:
// std::copy(in, in + length, out);
for (std::size_t i = 0; i < length; ++i)
out[i] = in[i];
}
The arrays in your first example basically work like pointers.
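From the caller's side, using a routine in your library's style would look roughly like this. The function scaleBy2 is a made-up stand-in with the same parameter shape as vector_math, not something from the library:

```cpp
#include <cassert>

// Hypothetical routine in the library's style: reads inV, writes the result to outV.
void scaleBy2(double inV[3], double outV[3])
{
    for (int i = 0; i < 3; ++i)
        outV[i] = 2.0 * inV[i];
}

// The caller owns both arrays. After the call, result holds the output,
// even though scaleBy2 itself returns void.
double demo()
{
    double input[3] = { 1.0, 2.0, 3.0 };
    double result[3];
    scaleBy2(input, result);
    return result[0] + result[1] + result[2];
}
```

So to use vector_math you declare the output array yourself, pass it in, and read it after the call.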
Baum's answer is accurate, but perhaps not as detailed as a C/C++ beginner would like.
The actual argument values that go into a function are always passed by value (i.e. a bit pattern) and cannot be changed in a way that is readable by the caller. HOWEVER - and this is the key - those bits in the arguments may in fact be pointers (or references) that don't contain data directly, but rather contain a location in memory that contains the actual value.
Examples: in a function like this:
void foo(double x, double output) { output = x * x; }
naming the output variable "output" doesn't change anything: there is no way for the caller to get the result.
But like this:
void foo(double x, double& output) { output = x * x; }
the "&" indicates that the output parameter is a reference to the memory location where the output should be stored. It's syntactic sugar in C++ that is equivalent to this 'C' code:
void foo(double x, double* pointer_to_output) { *pointer_to_output = x * x; }
The pointer dereference is hidden by the reference syntax but the idea is the same.
Arrays perform a similar syntax trick, they are actually passed as pointers, so
void foo(double x[3], double output[3]) { ... }
and
void foo(double* x, double* output) { ... }
are essentially equivalent. Note that in either case there is no way to determine the size of the arrays. Therefore, it is generally considered good practice to pass pointers and lengths:
void foo(double* x, int xlen, double* output, int olen);
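A filled-in version of that signature could look like this. The name square_all and the element-wise squaring are just an example of what such a function might do:

```cpp
#include <cassert>

// Squares each of the xlen input values into output.
// olen is the capacity of output and must be at least xlen.
void square_all(const double* x, int xlen, double* output, int olen)
{
    assert(olen >= xlen);
    for (int i = 0; i < xlen; ++i)
        output[i] = x[i] * x[i];
}
```

Because the lengths travel with the pointers, the function can at least check that the output buffer is big enough.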
Output parameters like this are used in multiple cases. A common one is to return multiple values, since the return type of a function can be only a single value. (You can return an object that contains multiple members, but you can't return multiple separate values directly.)
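Both ways of getting two results out of one call, side by side. This is a sketch; the DivResult struct and the divide functions are invented for the example:

```cpp
#include <cassert>

struct DivResult {      // hypothetical: one object carrying both results
    int quotient;
    int remainder;
};

// Variant 1: return a single object with multiple members.
DivResult divide(int a, int b)
{
    assert(b != 0);
    return DivResult{ a / b, a % b };
}

// Variant 2: write both results into output parameters.
void divide(int a, int b, int& quotient, int& remainder)
{
    assert(b != 0);
    quotient = a / b;
    remainder = a % b;
}
```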
Another reason why output parameters are used is speed. It's frequently faster to modify the output in place if the object in question is large and/or expensive to construct.
Another programming paradigm is to return a value that indicates the success/failure of the function and return calculated value(s) in output parameters. For example, much of the historic Windows API works this way.
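A small sketch of that status-plus-output style. The function try_divide is hypothetical and only meant to show the shape of such an interface.

```cpp
#include <cassert>

// Returns true on success and stores the quotient in 'result'.
// On failure, returns false and leaves 'result' untouched.
bool try_divide(double numerator, double denominator, double& result) {
    if (denominator == 0.0) {
        return false;
    }
    result = numerator / denominator;
    return true;
}

// Typical call site: check the status before trusting the output.
void example_try_divide() {
    double q = 0.0;
    if (try_divide(1.0, 4.0, q)) {
        assert(q == 0.25);
    }
}
```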
An array is a low-level C++ construct. It is implicitly convertible to a pointer to the memory allocated for the array.
int a[] = {1, 2, 3, 4, 5};
int *p = a; // a can be converted to a pointer
assert(a[0] == *a);
assert(a[1] == *(a + 1));
assert(a[1] == p[1]);
// etc.
The confusing thing about arrays is that a function declaration void foo(int bar[]); is equivalent to void foo(int *bar);. So foo(a) doesn't copy the array a; instead, a is converted to a pointer and the pointer - not the memory - is then copied.
void foo(int bar[]) // could be rewritten as foo(int *bar)
{
bar[0] = 1; // could be rewritten as *(bar + 0) = 1;
}
int main()
{
int a[] = {0};
foo(a);
assert(a[0] == 1);
}
bar points to the same memory that a does so modifying the contents of array pointed to by bar is the same as modifying the contents of array a.
In C++ you can also pass objects by reference (Type &ref). You can think of references as aliases for a given object. So if you write:
int a = 0;
int &b = a;
b = 1;
assert(a == 1);
b is effectively an alias for a - by modifying b you modify a and vice versa. Functions can also take arguments by reference:
void foo(int &bar)
{
bar = 1;
}
int main()
{
int a = 0;
foo(a);
assert(a == 1);
}
Again, bar is little more than an alias for a, so by modifying bar you will also modify a.
The library of mathematical routines you have is using these features to store results in an input variable. It does so to avoid copies and ease memory management. As mentioned by @Baum mit Augen, the method can also be used as a way to return multiple values.
Consider this code:
vector<int> foo(const vector<int> &bar)
{
vector<int> result;
// calculate the result
return result;
}
While returning result, foo will make a copy of the vector, and depending on number (and size) of elements stored the copy can be very expensive.
Note:
Most compilers will elide the copy in the code above using Named Return Value Optimization (NRVO). In the general case, though, you have no guarantee of it happening.
Another way to avoid expensive copies is to create the result object on heap, and return a pointer to the allocated memory:
vector<int> *foo(const vector<int> &bar)
{
vector<int> *result = new vector<int>;
// calculate the result
return result;
}
The caller needs to manage the lifetime of the returned object, calling delete when it's no longer needed. Failing to do so can result in a memory leak (the memory stays allocated but is effectively unusable by the application).
Note:
There are various solutions to help with returning (expensive to copy) objects. C++03 has the std::auto_ptr wrapper to help with lifetime management of objects created on the heap. C++11 adds move semantics to the language, which make it possible to efficiently return objects by value instead of using pointers.
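As an illustration of the C++11 route, here is a minimal sketch that returns a vector by value. The function make_squares is invented for this example, and it assumes a C++11 (or later) compiler.

```cpp
#include <cassert>
#include <vector>

// Returns the result by value. With C++11 or later the vector is moved
// (or the copy/move is elided entirely), so no element-wise copy is needed.
std::vector<int> make_squares(int count) {
    assert(count >= 0);
    std::vector<int> result;
    result.reserve(static_cast<std::vector<int>::size_type>(count));
    for (int i = 0; i < count; ++i) {
        result.push_back(i * i);
    }
    return result;  // no std::move needed here; it would only inhibit NRVO
}
```

The caller simply writes std::vector<int> v = make_squares(4); and owns the result, with no new/delete involved.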
I'm trying to initialize an int array with everything set at -1.
I tried the following, but it doesn't work. It only sets the first value at -1.
int directory[100] = {-1};
Why doesn't it work right?
I'm surprised at all the answers suggesting vector. They aren't even the same thing!
Use std::fill, from <algorithm>:
int directory[100];
std::fill(directory, directory + 100, -1);
Not concerned with the question directly, but you might want a nice helper function when it comes to arrays:
template <typename T, size_t N>
T* end(T (&pX)[N])
{
return pX + N;
}
Giving:
int directory[100];
std::fill(directory, end(directory), -1);
So you don't need to list the size twice.
I would suggest using std::array. For three reasons:
1. array provides bounds-checked element access through at() (operator[] itself is unchecked, just like a raw array),
2. array automatically carries the size, so you don't have to pass it separately
3. And most importantly, array provides the fill() method that is required for this problem
#include <array>
#include <assert.h>
typedef std::array< int, 100 > DirectoryArray;
void test_fill( DirectoryArray const & x, int expected_value ) {
for( size_t i = 0; i < x.size(); ++i ) {
assert( x[ i ] == expected_value );
}
}
int main() {
DirectoryArray directory;
directory.fill( -1 );
test_fill( directory, -1 );
return 0;
}
Using array requires compiling with "-std=c++0x" (or "-std=c++11" on newer compilers); this applies to the above code.
If that is not available or not an option, then other approaches such as std::fill() (as suggested by GMan) or hand-coding a fill() function can be used.
If you had a smaller number of elements you could specify them one after the other. Array initialization works by specifying each element, not by specifying a single value that applies for each element.
int x[3] = {-1, -1, -1 };
You could also use a vector and use the constructor to initialize all of the values. You can later access the raw array buffer by specifying &v.front()
std::vector<int> directory(100, -1);
There is also a C way to do it, using memset or similar functions. memset sets each char (byte) of the buffer, though, so it works fine for values like 0, and for -1 only when every byte of the int is 0xFF (as on two's complement machines); it cannot produce most other values.
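To make the byte-wise behaviour concrete, here is a small sketch. It assumes two's complement ints that are wider than one byte; the helper names are made up for the example.

```cpp
#include <cassert>
#include <cstring>

// memset writes the same byte into every byte of the buffer.
// Assumption for this sketch: two's complement ints, where -1 is
// all-ones bits, so filling every byte with 0xFF yields -1.
void fill_minus_one(int* data, std::size_t count) {
    assert(data != nullptr);
    std::memset(data, -1, count * sizeof(int));
}

// The same trick does NOT produce 1: each byte becomes 0x01, so a
// 4-byte int ends up as 0x01010101 rather than 1.
void fill_bytes_with_one(int* data, std::size_t count) {
    assert(data != nullptr);
    std::memset(data, 1, count * sizeof(int));
}
```

This is why memset is fine for 0 (and, in practice, -1) but is not a general replacement for std::fill.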
You can also use STL to initialize your array by using fill_n. For a general purpose action to each element you could use for_each.
fill_n(directory, 100, -1);
Or if you really want, you can go the lame way: write a for loop with 100 iterations that does directory[i] = -1;
If you really need arrays, you can use Boost's array class. Its assign member does the job:
boost::array<int,N> array; // boost arrays are of fixed size!
array.assign(-1);
It does work right. Your expectation of the initialiser is incorrect. If you really wish to take this approach, you'll need 100 comma-separated -1s in the initialiser. But then what happens when you increase the size of the array?
Use a vector of int instead of an array.
vector<int> directory(100,-1); // 100 ints with value -1
It is working right. That's how list initializers work.
I believe 6.7.8.10 of the C99 standard covers this:
If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static storage duration is not initialized explicitly, then:
if it has pointer type, it is initialized to a null pointer;
if it has arithmetic type, it is initialized to (positive or unsigned) zero;
if it is an aggregate, every member is initialized (recursively) according to these rules;
if it is a union, the first named member is initialized (recursively) according to these rules.
If you need to make all the elements in an array the same non-zero value, you'll have to use a loop or memset.
Also note that, unless you really know what you're doing, vectors are preferred over arrays in C++:
Here's what you need to realize about containers vs. arrays:
Container classes make programmers more productive. So if you insist on using arrays while those around are willing to use container classes, you'll probably be less productive than they are (even if you're smarter and more experienced than they are!).
Container classes let programmers write more robust code. So if you insist on using arrays while those around are willing to use container classes, your code will probably have more bugs than their code (even if you're smarter and more experienced).
And if you're so smart and so experienced that you can use arrays as fast and as safe as they can use container classes, someone else will probably end up maintaining your code and they'll probably introduce bugs. Or worse, you'll be the only one who can maintain your code so management will yank you from development and move you into a full-time maintenance role — just what you always wanted!
There's a lot more to the linked question; give it a read.
Simply use a for loop, as shown below:
for (int i=0; i<100; i++)
{
a[i]= -1;
}
As a result you get the array you want:
a[100]={-1,-1,-1..........(100 times)}
I had the same question and found out how to do it. The documentation gives the following example:
std::array<int, 3> a1{ {1, 2, 3} }; // double-braces required in C++11 (not in C++14)
So I just tried :
std::array<int, 3> a1{ {1} }; // double-braces required in C++11 (not in C++14)
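It compiles, but be careful: only the first element is set to 1. The remaining elements are value-initialized to 0, exactly as with a raw array, so this does not give every element the value 1. To set every element, call a1.fill(1) instead.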
Can't do what you're trying to do with a raw array (unless you explicitly list out all 100 -1s in the initializer list), you can do it with a vector:
vector<int> directory(100, -1);
Additionally, you can create the array and set the values to -1 using one of the other methods mentioned.
Just use this loop.
for(int i =0 ; i < 100 ; i++) directory[i] = -1;
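The almighty memset() will do the job for a raw int array in C/C++/C++11/C++14 when the value is -1 (every byte becomes 0xFF). It works byte by byte, so it is not a general-purpose fill, and it should not be used on std containers.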
The reason that int directory[100] = {-1} doesn't work is because of what happens with array initialization.
All array elements that are not initialized explicitly are initialized implicitly the same way as objects that have static storage duration.
ints which are implicitly initialized are initialized to zero.
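A quick way to check this rule is to count how many elements actually end up as -1. The helper below is a sketch written for this purpose; it uses std::begin and std::end, so it assumes C++11.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>

// Counts how many elements of 'int directory[100] = {-1};' are -1.
int count_minus_ones_after_brace_init() {
    int directory[100] = {-1};  // first element -1, remaining 99 are zero
    assert(directory[1] == 0);
    return static_cast<int>(
        std::count(std::begin(directory), std::end(directory), -1));
}
```

Only the explicitly listed element is -1; every other element is zero-initialized.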
C++11 introduced begin and end which are specialized for arrays!
This means that given an array (not just a pointer), like your directory you can use fill as has been suggested in several answers:
fill(begin(directory), end(directory), -1);
Suppose you write code like this and later, having forgotten the details, change the size of directory to 60. If you'd written the code using begin and end, then you're done.
If on the other hand you'd done this: fill(directory, directory + 100, -1) then you'd better remember to change that 100 to a 60 as well or you'll get undefined behavior.
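One way to make the size impossible to get wrong is to let the compiler deduce it. The helper fill_all below is a hypothetical wrapper, sketched under the assumption that a C++11 compiler is available.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>

// Takes the array by reference, so N is deduced from the argument and
// never has to be repeated at the call site.
template <typename T, std::size_t N>
void fill_all(T (&arr)[N], const T& value) {
    std::fill(std::begin(arr), std::end(arr), value);
}

// Changing 60 to any other size needs no further edits.
void example_fill_all() {
    int directory[60];
    fill_all(directory, -1);
    assert(directory[0] == -1 && directory[59] == -1);
}
```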
If you are allowed to use std::array, you can do the following:
#include <iostream>
#include <algorithm>
#include <array>
#include <iterator>
using namespace std;
template <class Elem, Elem pattern, size_t S, size_t L>
struct S_internal {
template <Elem... values>
static array<Elem, S> init_array() {
return S_internal<Elem, pattern, S, L - 1>::template init_array<values..., pattern>();
}
};
template <class Elem, Elem pattern, size_t S>
struct S_internal<Elem, pattern, S, 0> {
template <Elem... values>
static array<Elem, S> init_array() {
static_assert(S == sizeof...(values), "");
return array<Elem, S> {{values...}};
}
};
template <class Elem, Elem pattern, size_t S>
struct init_array
{
static array<Elem, S> get() {
return S_internal<Elem, pattern, S, S>::template init_array<>();
}
};
int main()
{
array<int, 5> ss = init_array<int, 77, 5>::get();
copy(cbegin(ss), cend(ss), ostream_iterator<int>(cout, " "));
}
The output is:
77 77 77 77 77
Just use the fill_n() method.
Example
int n;
cin>>n;
int arr[n]; // variable-length array: a compiler extension, not standard C++
int value = 9;
fill_n(arr, n, value); // 9 9 9 9 9...
Learn More about fill_n()
or
you can use the fill() method.
Example
int n;
cin>>n;
int arr[n]; // variable-length array: a compiler extension, not standard C++
int value = 9;
fill(arr, arr+n, value); // 9 9 9 9 9...
Learn More about fill() method.
Note: Both these methods are available in algorithm library (#include<algorithm>). Don't forget to include it.
Starting with C++11 you could also use a range based loop:
int directory[10];
for (auto& value: directory) value = -1;
I myself am convinced that in a project I'm working on signed integers are the best choice in the majority of cases, even though the value contained within can never be negative. (Simpler reverse for loops, less chance for bugs, etc., in particular for integers which can only hold values between 0 and, say, 20, anyway.)
The majority of the places where this goes wrong is a simple iteration of a std::vector, often this used to be an array in the past and has been changed to a std::vector later. So these loops generally look like this:
for (int i = 0; i < someVector.size(); ++i) { /* do stuff */ }
Because this pattern is used so often, the amount of compiler warning spam about this comparison between signed and unsigned types tends to hide more useful warnings. Note that we definitely do not have vectors with more than INT_MAX elements, and note that until now we used two ways to fix the compiler warning:
for (unsigned i = 0; i < someVector.size(); ++i) { /*do stuff*/ }
This usually works but might silently break if the loop contains any code like 'if (i-1 >= 0) ...', etc.
for (int i = 0; i < static_cast<int>(someVector.size()); ++i) { /*do stuff*/ }
This change does not have any side effects, but it does make the loop a lot less readable. (And it's more typing.)
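To illustrate why the unsigned variant can silently break, here is a small sketch of what i - 1 evaluates to at i == 0. The helper names are made up for the example.

```cpp
#include <cassert>
#include <climits>

// Unsigned arithmetic wraps: for i == 0 this returns UINT_MAX, not -1.
unsigned previous_index_unsigned(unsigned i) { return i - 1; }

// Signed arithmetic gives the -1 that a check like 'i - 1 >= 0' relies on.
int previous_index_signed(int i) { return i - 1; }

void example_wraparound() {
    assert(previous_index_unsigned(0u) == UINT_MAX);
    assert(previous_index_signed(0) == -1);
}
```

With an unsigned i, a condition such as i - 1 >= 0 is therefore true for every value of i, including 0.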
So I came up with the following idea:
template <typename T> struct vector : public std::vector<T>
{
typedef std::vector<T> base;
int size() const { return base::size(); }
int max_size() const { return base::max_size(); }
int capacity() const { return base::capacity(); }
vector() : base() {}
vector(int n) : base(n) {}
vector(int n, const T& t) : base(n, t) {}
vector(const base& other) : base(other) {}
};
template <typename Key, typename Data> struct map : public std::map<Key, Data>
{
typedef std::map<Key, Data> base;
typedef typename base::key_compare key_compare;
int size() const { return base::size(); }
int max_size() const { return base::max_size(); }
int erase(const Key& k) { return base::erase(k); }
int count(const Key& k) { return base::count(k); }
map() : base() {}
map(const key_compare& comp) : base(comp) {}
template <class InputIterator> map(InputIterator f, InputIterator l) : base(f, l) {}
template <class InputIterator> map(InputIterator f, InputIterator l, const key_compare& comp) : base(f, l, comp) {}
map(const base& other) : base(other) {}
};
// TODO: similar code for other container types
What you see is basically the STL classes with the methods which return size_type overridden to return just 'int'. The constructors are needed because these aren't inherited.
What would you think of this as a developer, if you'd see a solution like this in an existing codebase?
Would you think 'whaa, they're redefining the STL, what a huge WTF!', or would you think this is a nice simple solution to prevent bugs and increase readability. Or maybe you'd rather see we had spent (half) a day or so on changing all these loops to use std::vector<>::iterator?
(In particular if this solution was combined with banning the use of unsigned types for anything but raw data (e.g. unsigned char) and bit masks.)
Don't derive publicly from STL containers. They have non-virtual destructors, which invokes undefined behaviour if anyone deletes one of your objects through a pointer-to-base. If you must derive e.g. from a vector, do it privately and expose the parts you need to expose with using declarations.
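A rough sketch of what private derivation with using declarations could look like. The class name IntVector and the particular set of re-exported members are assumptions for this example, not a complete wrapper.

```cpp
#include <cassert>
#include <climits>
#include <vector>

// Private inheritance: there is no implicit conversion to std::vector<T>*,
// so nobody can delete an IntVector through a pointer to the base class.
template <typename T>
class IntVector : private std::vector<T> {
    typedef std::vector<T> base;

public:
    IntVector() : base() {}
    explicit IntVector(int n, const T& value = T())
        : base(static_cast<typename base::size_type>(n), value) {}

    // Re-export only the members callers are meant to use.
    using base::push_back;
    using base::operator[];
    using base::begin;
    using base::end;
    using base::empty;

    // Signed size, as in the question; checks that the size fits in an int.
    int size() const {
        assert(base::size() <= static_cast<typename base::size_type>(INT_MAX));
        return static_cast<int>(base::size());
    }
};
```

Anything not listed in a using declaration (for example max_size or capacity) stays hidden, so the interface has to be widened deliberately.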
Here, I'd just use a size_t as the loop variable. It's simple and readable. The poster who commented that using an int index exposes you as a n00b is correct. However, using an iterator to loop over a vector exposes you as a slightly more experienced n00b - one who doesn't realize that the subscript operator for vector is constant time. (vector<T>::size_type is accurate, but needlessly verbose IMO).
While I don't think "use iterators, otherwise you look n00b" is a good solution to the problem, deriving from std::vector appears much worse than that.
First, developers do expect vector to be std::vector, and map to be std::map. Second, your solution does not scale for other containers, or for other classes/libraries that interact with containers.
Yes, iterators are ugly, iterator loops are not very well readable, and typedefs only cover up the mess. But at least, they do scale, and they are the canonical solution.
My solution? an stl-for-each macro. That is not without problems (mainly, it is a macro, yuck), but it gets across the meaning. It is not as advanced as e.g. this one, but does the job.
I made this community wiki... Please edit it. I don't agree with the advice against "int" anymore. I now see it as not bad.
Yes, I agree with Richard. You should never use 'int' as the counting variable in a loop like those. The following is how you might want to do various loops using indices (although there is little reason to, occasionally this can be useful).
Forward
for(std::vector<int>::size_type i = 0; i < someVector.size(); i++) {
/* ... */
}
Backward
You can do this, which is perfectly defined behavior:
for(std::vector<int>::size_type i = someVector.size() - 1;
i != (std::vector<int>::size_type) -1; i--) {
/* ... */
}
Soon, with c++1x (next C++ version) coming along nicely, you can do it like this:
for(auto i = someVector.size() - 1; i != (decltype(i)) -1; i--) {
/* ... */
}
Decrementing below 0 will cause i to wrap around, because it is unsigned.
But unsigned will make bugs slurp in
That should never be an argument to make it the wrong way (using 'int').
Why not use std::size_t above?
The C++ Standard defines in 23.1 p5 (Container Requirements) that T::size_type, for T being some Container, is some implementation-defined unsigned integral type. Now, using std::size_t for i above will let bugs slurp in silently. If T::size_type is less or greater than std::size_t, then it will overflow i, or not even get up to (std::size_t)-1 if someVector.size() == 0. Likewise, the condition of the loop would have been broken completely.
Definitely use an iterator. Soon you will be able to use the 'auto' type, for better readability (one of your concerns) like this:
for (auto i = someVector.begin();
i != someVector.end();
++i)
Skip the index
The easiest approach is to sidestep the problem by using iterators, range-based for loops, or algorithms:
for (auto it = begin(v); it != end(v); ++it) { ... }
for (const auto &x : v) { ... }
std::for_each(v.begin(), v.end(), ...);
This is a nice solution if you don't actually need the index value. It also handles reverse loops easily.
Use an appropriate unsigned type
Another approach is to use the container's size type.
for (std::vector<T>::size_type i = 0; i < v.size(); ++i) { ... }
You can also use std::size_t (from <cstddef>). There are those who (correctly) point out that std::size_t may not be the same type as std::vector<T>::size_type (though it usually is). You can, however, be assured that the container's size_type will fit in a std::size_t. So everything is fine, unless you use certain styles for reverse loops. My preferred style for a reverse loop is this:
for (std::size_t i = v.size(); i-- > 0; ) { ... }
With this style, you can safely use std::size_t, even if it's a larger type than std::vector<T>::size_type. The style of reverse loops shown in some of the other answers require casting a -1 to exactly the right type and thus cannot use the easier-to-type std::size_t.
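To see exactly which indices that loop visits, here is a small sketch that records them. The function reverse_indices exists only for this demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Collects the indices visited by the 'i-- > 0' reverse loop.
// The comparison uses the value before the decrement, so the body sees
// size-1 down to 0, and an empty container skips the body entirely
// without relying on a cast of -1.
template <typename T>
std::vector<std::size_t> reverse_indices(const std::vector<T>& v) {
    std::vector<std::size_t> visited;
    for (std::size_t i = v.size(); i-- > 0; ) {
        assert(i < v.size());  // the index is always valid inside the body
        visited.push_back(i);
    }
    return visited;
}
```

For a three-element vector the recorded indices are 2, 1, 0; for an empty vector nothing is recorded.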
Use a signed type (carefully!)
If you really want to use a signed type (or if your style guide practically demands one), like int, then you can use this tiny function template that checks the underlying assumption in debug builds and makes the conversion explicit so that you don't get the compiler warning message:
#include <cassert>
#include <cstddef>
#include <limits>
template <typename ContainerType>
constexpr int size_as_int(const ContainerType &c) {
const auto size = c.size(); // if no auto, use `typename ContainerType::size_type`
assert(size <= static_cast<std::size_t>(std::numeric_limits<int>::max()));
return static_cast<int>(size);
}
Now you can write:
for (int i = 0; i < size_as_int(v); ++i) { ... }
Or reverse loops in the traditional manner:
for (int i = size_as_int(v) - 1; i >= 0; --i) { ... }
The size_as_int trick is only slightly more typing than the loops with the implicit conversions, you get the underlying assumption checked at runtime in debug builds, you silence the compiler warning with the explicit cast, you get the same speed in non-debug builds because it will almost certainly be inlined, and the optimized object code shouldn't be any larger because the template doesn't do anything the compiler wasn't already doing implicitly.
You're overthinking the problem.
Using a size_t variable is preferable, but if you don't trust your programmers to use unsigned correctly, go with the cast and just deal with the ugliness. Get an intern to change them all and don't worry about it after that. Turn on warnings as errors and no new ones will creep in. Your loops may be "ugly" now, but you can understand that as the consequences of your religious stance on signed versus unsigned.
vector.size() returns a size_t var, so just change int to size_t and it should be fine.
Richard's answer is more correct, except that it's a lot of work for a simple loop.
I notice that people have very different opinions about this subject. I also have an opinion which does not convince others, so it makes sense to look for support from some gurus, and I found the C++ Core Guidelines:
https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines
maintained by Bjarne Stroustrup and Herb Sutter, and their last update, upon which I base the information below, is of April 10, 2022.
Please take a look at the following code rules:
ES.100: Don’t mix signed and unsigned arithmetic
ES.101: Use unsigned types for bit manipulation
ES.102: Use signed types for arithmetic
ES.107: Don’t use unsigned for subscripts, prefer gsl::index
So, supposing that we want to index in a for loop and for some reason the range based for loop is not the appropriate solution, then using an unsigned type is also not the preferred solution. The suggested solution is using gsl::index.
But in case you don’t have gsl around and you don’t want to introduce it, what then?
In that case I would suggest to have a utility template function as suggested by Adrian McCarthy: size_as_int
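Another option, if you only want the signed index type and not the rest of the GSL, is a local alias. The sketch below assumes that gsl::index is an alias for std::ptrdiff_t; index_t and sum_with_signed_index are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical local stand-in for gsl::index (a signed index type).
typedef std::ptrdiff_t index_t;

// Sums the elements using a signed index; the conversions between the
// signed index and the container's unsigned size are explicit.
long sum_with_signed_index(const std::vector<int>& v) {
    long total = 0;
    const index_t n = static_cast<index_t>(v.size());
    assert(n >= 0);  // holds as long as the size fits in index_t
    for (index_t i = 0; i < n; ++i) {
        total += v[static_cast<std::size_t>(i)];
    }
    return total;
}
```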