Say I have the class
class A{
int value;
public:
A(int val) : value(val) {}
};
I store pointers to instances in a collection such as a vector, using a for loop:
std::vector<A*> myCollection;
for (int i = 0; i < 10; ++i){
myCollection.push_back(&A(i));
}
Now the for loop will construct and destruct an object at the same memory location resulting in a vector with 10 pointers pointing to the same address and dereferencing them will give A->value = 9.
Is there any way around this without dynamic allocation? And yes I have to use collection of pointers and not references.
If the objects need to be on the stack, but you also want a vector of pointers because of some API requirement, etc., just create an array of the objects, then store the pointers. Be very mindful of the lifetime issues.
size_t const sz = 3;
A arr[sz] {1, 2, 3};
std::vector<A*> v;
v.reserve(sz);
for (auto& a : arr) v.push_back(&a);
someFunc(v);
The problem with your current program is that A(i) is a prvalue, so its address cannot be taken with the & operator. This means that the following expression in your code is invalid:
//---------------------vvvvv----->invalid because A(i) is a prvalue
myCollection.push_back(&A(i));
You could instead use a std::vector<A> in addition to std::vector<A*> as shown below:
std::vector<A> myVector;
myVector.reserve(10);
//-------^^^^^^^---------------->to avoid reallocations when capacity is not enough
for (int i = 0; i < 10; ++i){
myVector.emplace_back(i);
//-----------^^^^^^^^^^^^------->use emplace_back to forward the argument i
}
std::vector<A*> myPtrVector;
myPtrVector.reserve(10);
for(auto&elem: myVector)
{
myPtrVector.push_back(&elem);
}
I am fairly new to C++ and have been avoiding pointers. From what I've read online I cannot return an array but I can return a pointer to it. I made a small code to test it and was wondering if this was the normal / correct way to do this:
#include <iostream>
using namespace std;
int* test (int in[5]) {
int* out = in;
return out;
}
int main() {
int arr[5] = {1, 2, 3, 4, 5};
int* pArr = test(arr);
for (int i = 0; i < 5; i++) cout<<pArr[i]<<endl;
cout<<endl;
return 0;
}
Edit: This seems to be no good. How should I rewrite it?
int* test (int a[5], int b[5]) {
int c[5];
for (int i = 0; i < 5; i++) c[i] = a[i]+b[i];
int* out = c;
return out;
}
Your code as it stands is correct but I am having a hard time figuring out how it could/would be used in a real world scenario. With that said, please be aware of a few caveats when returning pointers from functions:
When you create an array with syntax int arr[5];, it's allocated on the stack and is local to the function.
C++ allows you to return a pointer to this array, but it is undefined behavior to use the memory pointed to by this pointer outside of its local scope. Read this great answer using a real world analogy to get a much clearer understanding than what I could ever explain.
You can still use the array outside the scope if you can guarantee that the memory of the array has not been purged. In your case this is true when you pass arr to test().
If you want to pass around pointers to a dynamically allocated array without worrying about memory leaks, you should do some reading on std::unique_ptr/std::shared_ptr<>.
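As a minimal sketch of the smart-pointer route mentioned above: the function can hand ownership of a heap array back to the caller, so nobody has to remember delete[]. The name add_arrays is hypothetical, and std::make_unique assumes C++14 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical helper: returns a heap-allocated array holding a[i] + b[i].
// The unique_ptr owns the storage and releases it with delete[] automatically.
std::unique_ptr<int[]> add_arrays(const int* a, const int* b, std::size_t n) {
    assert(a != nullptr && b != nullptr);
    auto out = std::make_unique<int[]>(n);  // elements start at zero
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = a[i] + b[i];
    }
    return out;  // ownership moves to the caller
}
```

A caller would write auto res = add_arrays(a, b, 5); and index it as res[i]; the memory is freed when res goes out of scope.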
Edit - to answer the use-case of matrix multiplication
You have two options. The naive way is to use std::unique_ptr/std::shared_ptr<>. The Modern C++ way is to have a Matrix class where you overload operator * and you absolutely must use the new rvalue references if you want to avoid copying the result of the multiplication to get it out of the function. In addition to having your copy constructor, operator = and destructor, you also need to have move constructor and move assignment operator. Go through the questions and answers of this search to gain more insight on how to achieve this.
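A rough sketch of the Matrix idea, under the assumption that the elements live in a std::vector (the class and its members are hypothetical, not taken from any library). With that choice the compiler-generated copy and move operations are already correct, so none of them has to be written by hand; hand-written ones are only needed if the class manages raw memory itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal matrix type; storage is a flat row-major std::vector.
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}

    double& operator()(std::size_t r, std::size_t c) { return data_[r * cols_ + c]; }
    double operator()(std::size_t r, std::size_t c) const { return data_[r * cols_ + c]; }

    std::size_t rows() const { return rows_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t rows_;
    std::size_t cols_;
    std::vector<double> data_;  // vector supplies the copy and move operations
};

// Returns the product by value; the local result is moved or elided,
// so the element storage is not deep-copied on the way out.
Matrix operator*(const Matrix& a, const Matrix& b) {
    assert(a.cols() == b.rows());
    Matrix result(a.rows(), b.cols());
    for (std::size_t i = 0; i < a.rows(); ++i)
        for (std::size_t k = 0; k < a.cols(); ++k)
            for (std::size_t j = 0; j < b.cols(); ++j)
                result(i, j) += a(i, k) * b(k, j);
    return result;
}
```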
Edit 2 - answer to appended question
int* test (int a[5], int b[5]) {
int *c = new int[5];
for (int i = 0; i < 5; i++)
c[i] = a[i]+b[i];
return c;
}
If you are using this as int *res = test(a,b);, then sometime later in your code, you should call delete[] res to free the memory allocated in the test() function. You see now the problem is it is extremely hard to manually keep track of when to make the call to delete. Hence the approaches for dealing with it were outlined in the answer.
Your code is OK. Note though that if you return a pointer to an array, and that array goes out of scope, you should not use that pointer anymore. Example:
int* test (void)
{
int out[5];
return out;
}
The above will never work, because out does not exist anymore when test() returns. The returned pointer must not be used anymore. If you do use it, you will be reading/writing to memory you shouldn't.
In your original code, the arr array goes out of scope when main() returns. Obviously that's no problem, since returning from main() also means that your program is terminating.
If you want something that will stick around and cannot go out of scope, you should allocate it with new:
int* test (void)
{
int* out = new int[5];
return out;
}
The returned pointer will stay valid until you delete it. Remember to delete it again when you're done with it though, using delete[]:
int* array = test();
// ...
// Done with the array.
delete[] array;
Deleting it is the only way to reclaim the memory it uses.
New answer to new question:
You cannot return a pointer to an automatic variable (int c[5]) from the function. An automatic variable's lifetime ends when its enclosing block (the function in this case) returns, so you are returning a pointer to an array that no longer exists.
Either make your variable dynamic:
int* test (int a[5], int b[5]) {
int* c = new int[5];
for (int i = 0; i < 5; i++) c[i] = a[i]+b[i];
return c;
}
Or change your implementation to use std::array:
std::array<int,5> test (const std::array<int,5>& a, const std::array<int,5>& b)
{
std::array<int,5> c;
for (int i = 0; i < 5; i++) c[i] = a[i]+b[i];
return c;
}
In case your compiler does not provide std::array you can replace it with simple struct containing an array:
struct array_int_5 {
int data[5];
int& operator [](int i) { return data[i]; }
int operator [](int i) const { return data[i]; }
};
Old answer to old question:
Your code is correct, and ... hmm, well, ... useless. Since arrays can be assigned to pointers without extra function (note that you are already using this in your function):
int arr[5] = {1, 2, 3, 4, 5};
//int* pArr = test(arr);
int* pArr = arr;
Moreover, the signature of your function:
int* test (int in[5])
Is equivalent to:
int* test (int* in)
So you see it makes no sense.
However this signature takes an array, not pointer:
int* test (int (&in)[5])
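To make the difference concrete, here is a small sketch (sum_ptr and sum_ref are hypothetical names). The pointer form needs the length passed separately, while the reference-to-array form keeps the size in the type, so a template can deduce it.

```cpp
#include <cassert>
#include <cstddef>

// Pointer form: any [5] written in the parameter is ignored, so the size is lost
// and has to be passed alongside.
int sum_ptr(const int* in, std::size_t len) {
    assert(in != nullptr || len == 0);
    int total = 0;
    for (std::size_t i = 0; i < len; ++i) total += in[i];
    return total;
}

// Reference form: N is deduced from the argument's array type.
template <std::size_t N>
int sum_ref(const int (&in)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) total += in[i];
    return total;
}
```

With int arr[5], sum_ref(arr) compiles and deduces N as 5, whereas passing a plain int* to sum_ref is a compile error.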
In most expressions the name of an array decays to a pointer to its first element, so yes, you can legitimately return a pointer to an array, because in that context they're essentially the same thing. Check this out yourself:
#include <assert.h>
int main() {
int a[] = {1, 2, 3, 4, 5};
int* pArr = a;
int* pFirstElem = &(a[0]);
assert(a == pArr);
assert(a == pFirstElem);
return 0;
}
This also means that passing an array to a function should be done via pointer (and not via int in[5]), and possibly along with the length of the array:
int* test(int* in, int len) {
int* out = in;
return out;
}
That said, you're right that using pointers (without fully understanding them) is pretty dangerous. For example, referencing an array that was allocated on the stack and went out of scope yields undefined behavior:
#include <iostream>
using namespace std;
int main() {
int* pArr = 0;
{
int a[] = {1, 2, 3, 4, 5};
pArr = a; // or test(a) if you wish
}
// a[] went out of scope here, but pArr holds a pointer to it
// all bets are off, this can output "1", output 1st chapter
// of "Romeo and Juliet", crash the program or destroy the
// universe
cout << pArr[0] << endl; // WRONG!
return 0;
}
So if you don't feel competent enough, just use std::vector.
[answer to the updated question]
The correct way to write your test function is either this:
void test(int* a, int* b, int* c, int len) {
for (int i = 0; i < len; ++i) c[i] = a[i] + b[i];
}
...
int main() {
int a[5] = {...}, b[5] = {...}, c[5] = {};
test(a, b, c, 5);
// c now holds the result
}
Or this (using std::vector):
#include <vector>
std::vector<int> test(const std::vector<int>& a, const std::vector<int>& b) {
std::vector<int> result(a.size());
for (std::size_t i = 0; i < a.size(); ++i) {
result[i] = a[i] + b[i];
}
return result; // copy will be elided
}
In a real app, the way you returned the array is called using an out parameter. Of course you don't actually have to return a pointer to the array, because the caller already has it, you just need to fill in the array. It's also common to pass another argument specifying the size of the array so as to not overflow it.
Using an out parameter has the disadvantage that the caller may not know how large the array needs to be to store the result. In that case, you can return a std::vector or similar array class instance.
Your code (which looks ok) doesn't return a pointer to an array. It returns a pointer to the first element of an array.
In fact that's usually what you want to do. Most manipulation of arrays are done via pointers to individual elements, not via pointers to the array as a whole.
You can define a pointer to an array, for example this:
double (*p)[42];
defines p as a pointer to a 42-element array of doubles. A big problem with that is that you have to specify the number of elements in the array as part of the type -- and that number has to be a compile-time constant. Most programs that deal with arrays need to deal with arrays of varying sizes; a given array's size won't vary after it's been created, but its initial size isn't necessarily known at compile time, and different array objects can have different sizes.
A pointer to the first element of an array lets you use either pointer arithmetic or the indexing operator [] to traverse the elements of the array. But the pointer doesn't tell you how many elements the array has; you generally have to keep track of that yourself.
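The two kinds of pointer can be compared directly. In this sketch (the function name is hypothetical) both pointers hold the same address, but the pointer to the array knows it refers to 42 doubles while the element pointer only knows about one.

```cpp
#include <cassert>
#include <cstddef>

// Returns true when the two pointers differ only in what they point at:
// a whole 42-element array versus a single element.
bool same_address_different_type() {
    double arr[42] = {};
    double (*p)[42] = &arr;  // pointer to the whole array
    double* q = arr;         // pointer to the first element (array decays)
    assert(static_cast<void*>(p) == static_cast<void*>(q));
    return sizeof(*p) == 42 * sizeof(double) && sizeof(*q) == sizeof(double);
}
```

Indexing also differs: (*p)[1] and q[1] both name the second element, but p + 1 skips a whole array while q + 1 skips one double.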
If a function needs to create an array and return a pointer to its first element, you have to manage the storage for that array yourself, in one of several ways. You can have the caller pass in a pointer to (the first element of) an array object, probably along with another argument specifying its size -- which means the caller has to know how big the array needs to be. Or the function can return a pointer to (the first element of) a static array defined inside the function -- which means the size of the array is fixed, and the same array will be clobbered by a second call to the function. Or the function can allocate the array on the heap -- which makes the caller responsible for deallocating it later.
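The three storage strategies just described might look like this; all function names are hypothetical and the bodies are deliberately trivial.

```cpp
#include <cassert>
#include <cstddef>

// Option 1: the caller supplies the buffer and its length.
void fill_squares(int* out, std::size_t len) {
    assert(out != nullptr || len == 0);
    for (std::size_t i = 0; i < len; ++i) out[i] = static_cast<int>(i * i);
}

// Option 2: a static array inside the function; every call reuses the same storage,
// so a second call overwrites what the first call returned.
int* scaled_sequence(int factor) {
    static int buffer[3];
    for (int i = 0; i < 3; ++i) buffer[i] = factor * (i + 1);
    return buffer;
}

// Option 3: heap allocation; the caller must delete[] the result.
int* make_zeros(std::size_t len) {
    return new int[len]();  // () value-initializes the elements to zero
}
```

Calling scaled_sequence(1) and then scaled_sequence(10) yields the same pointer both times, and the values from the first call are gone.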
Everything I've written so far is common to C and C++, and in fact it's much more in the style of C than C++. Section 6 of the comp.lang.c FAQ discusses the behavior of arrays and pointers in C.
But if you're writing in C++, you're probably better off using C++ idioms. For example, the C++ standard library provides a number of headers defining container classes such as <vector> and <array>, which will take care of most of this stuff for you. Unless you have a particular reason to use raw arrays and pointers, you're probably better off just using C++ containers instead.
EDIT: I think you edited your question as I was typing this answer. The new code at the end of your question is, as you observe, no good; it returns a pointer to an object that ceases to exist as soon as the function returns. I think I've covered the alternatives.
you can (sort of) return an array
instead of
int m1[5] = {1, 2, 3, 4, 5};
int m2[5] = {6, 7, 8, 9, 10};
int* m3 = test(m1, m2);
write
struct mystruct
{
int arr[5];
};
int m1[5] = {1, 2, 3, 4, 5};
int m2[5] = {6, 7, 8, 9, 10};
mystruct m3 = test(m1,m2);
where test looks like
struct mystruct test(int m1[5], int m2[5])
{
struct mystruct s;
for (int i = 0; i < 5; ++i ) s.arr[i]=m1[i]+m2[i];
return s;
}
This is not very efficient, since the struct is returned by value and the caller receives a copy of the array.
This code below doesn't work because I push_back the vectors a and b to the vector vector and then alter the vectors a and b. I want to alter the vectors a and b so that the vector vector suffers the same modifications. How do I do this?
#include <iostream>
#include <vector>
int main()
{
std::vector<std::vector<int>>vector;
std::vector<int>a;
std::vector<int>b;
vector.push_back(a);
vector.push_back(b);
for (int i = 1; i <= 10; i++)
a.push_back(i);
for (int i = 11; i <= 20; i++)
b.push_back(i);
std::cout << vector[1][0];
std::cin.get();
}
You can use std::reference_wrapper (since C++11).
std::reference_wrapper is a class template that wraps a reference in a copyable, assignable object. It is frequently used as a mechanism to store references inside standard containers (like std::vector) which cannot normally hold references.
e.g.
std::vector<std::reference_wrapper<std::vector<int>>> v;
std::vector<int> a;
std::vector<int> b;
v.push_back(a);
v.push_back(b);
for (int i = 1; i <= 10; i++)
a.push_back(i);
for (int i = 11; i <= 20; i++)
b.push_back(i);
std::cout << v[1].get()[0]; //11
Note that if the vector has a longer lifetime than a and b, then once a and b are destroyed the references stored in the vector are left dangling.
Create v (vector is not a good name since it shares with the library and makes the code confusing) to be a vector of vector pointers (since a vector of references is not possible):
std::vector<std::vector<int> *> v; //declare as vec of vec pointers
...
v.push_back(&a); //push_back addresses of a and b
v.push_back(&b);
...
std::cout << v.at(1)->at(0); //dereference and call at on the inner vec
Note that this can be dangerous if a or b go out of scope before v, as that will leave you with dangling pointers, a mess of undefined behavior and a murder of time-consuming bugs.
The basic issue is that push_back copies its parameter to the end of the vector. To modify the object in the vector, you need to get a reference to it. One approach:
std::vector< std::vector<int> > my_vector;
my_vector.reserve(2); // Going over the allocation invalidates references
my_vector.push_back( std::vector<int>() );
std::vector<int> & a = my_vector.back();
my_vector.push_back( std::vector<int>() );
std::vector<int> & b = my_vector.back();
(I changed the name of the variable because using "vector" as a variable name tends to lead to confusion.)
If you can use C++17, you can reduce the lines of code with emplace_back, which returns a reference to the newly inserted element from C++17 onwards.
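A sketch of that C++17 shortcut, assuming a compiler in C++17 mode (the function name build is hypothetical). Since C++17, emplace_back returns a reference to the element it just created, so the separate back() call disappears.

```cpp
#include <cassert>
#include <vector>

// Requires C++17: emplace_back returns a reference to the new element.
std::vector<std::vector<int>> build() {
    std::vector<std::vector<int>> my_vector;
    my_vector.reserve(2);  // keeps the reference a valid while b is appended
    assert(my_vector.capacity() >= 2);
    std::vector<int>& a = my_vector.emplace_back();
    std::vector<int>& b = my_vector.emplace_back();
    for (int i = 1; i <= 10; ++i) a.push_back(i);
    for (int i = 11; i <= 20; ++i) b.push_back(i);
    return my_vector;
}
```

The reserve call still matters: without it, the second emplace_back could reallocate and leave a dangling.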
If you know the number of vectors ahead of time you can do it like this:
std::vector<std::vector<int>> v(2);
std::vector<int> &a = v[0];
std::vector<int> &b = v[1];
...
Consider the following code:
#include <cstdio>
#include <iostream>
#include <vector>
using namespace std;
class SomeClass {
public:
SomeClass(int num) : val_(num) {}
int val_;
int val() const { return val_; }
};
// Given a vector of vector of numbers, this class will generate a vector of vector of pointers
// that point to SomeClass.
class Generator {
public:
vector<SomeClass> objects_;
vector<vector<SomeClass*> > Generate(const vector<vector<int> >& input) {
vector<vector<SomeClass*> > out;
for (const auto& vec : input) {
out.push_back({});
for (const int num : vec) {
SomeClass s(num);
objects_.push_back(s);
out.back().push_back(&objects_.back());
}
}
return out;
}
};
int main() {
Generator generator;
auto output = generator.Generate({{2, 3}, {4, 5}, {6}});
for (const auto& vec : output) {
for (const auto* obj : vec) {
printf("%d ",obj->val());
}
printf("\n");
}
return 0;
}
The Generate method in the Generator class will simply convert the vector of vector of ints to a vector of vector of pointers to SomeClass.
SomeClass is simply a container for a simple int value with a getter method.
I would expect the following output:
2 3
4 5
6
However, I get the following output:
junk_integer junk_integer
4 5
6
It seems the pointers in the first row become dangling pointers. What is wrong with this code?
You're storing pointers into a vector, then adding elements to the vector. Since you're not reserving enough space for all the elements you're adding, when the vector resizes it invalidates all the pointers to the old data.
You'll either have to reserve enough space before storing the pointers, store the pointers after you've stored everything in the vector you need to store, or not store pointers (maybe store an index, instead).
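The index option could be sketched as follows. Item, generate_indices and item_at are hypothetical stand-ins for SomeClass and Generate; the point is only that an index stays meaningful after the vector reallocates, whereas a pointer does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Item { int value; };

// Hypothetical variant of Generate: records indices into objects instead of pointers.
std::vector<std::vector<std::size_t>> generate_indices(
        const std::vector<std::vector<int>>& input, std::vector<Item>& objects) {
    std::vector<std::vector<std::size_t>> out;
    for (const auto& row : input) {
        out.emplace_back();
        for (int num : row) {
            objects.push_back(Item{num});
            out.back().push_back(objects.size() - 1);  // index survives reallocation
        }
    }
    return out;
}

// Looks an element up again through its index.
const Item& item_at(const std::vector<Item>& objects, std::size_t index) {
    assert(index < objects.size());
    return objects[index];
}
```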
All operations that increase the number of elements in a std::vector, including push_back(), invalidate all iterators, pointers and references that refer to elements of the vector if the resizing results in a change of vector capacity.
Your code is doing
objects_.push_back(s);
out.back().push_back(&objects_.back());
within a loop. Any call of objects_.push_back() that changes the capacity invalidates iterators and pointers into objects_, and can therefore leave out containing invalid (dangling) pointers.
You are storing pointers to objects contained in generator.objects_. Some of them become dangling pointers when you call push_back() on that.
In general, storing pointers to objects that live inside a std::vector is a bad idea.
I've a small code snippet which produces an error on running. The following code stores data of mystruct and creates a vector of pointers which store the addresses of the corresponding data.
struct mystruct{
std::vector < int > someinfo;
int somenumbers;
double *somepointer;
double someparm;
void Print(){....}
};
void DoSomething(mystruct &_somestruct, std::vector< mystruct > &_somevec,
std::vector<mystruct *> &_ptr){
_somevec.push_back(_somestruct);
_ptr.push_back(&(_somevec.back()));
}
void ReadStruct(){
std::vector<mystruct > _vec;
std::vector<mystruct *> _ptr;
for(int i=0;i<100;i++){
mystruct _struct;
_struct.somenumbers = 3;
_struct.someinfo.push_back(0);
_struct.someinfo.push_back(1);
_struct.someinfo.push_back(2);
DoSomething(_struct, _vec, _ptr);
}
_vec[0].Print(); //Prints correctly
_ptr[0]->Print();//Prints garbage info
}
If I first create the vector and then create the vector of pointers then the code works perfectly i.e.
void DoSomething(mystruct &_somestruct, std::vector< mystruct > &_somevec){
_somevec.push_back(_somestruct);
//_ptr.push_back(&(_somevec.back()));
}
void DoSomething1(std::vector< mystruct > &_somevec, std::vector<mystruct *> &_ptr){
for(int i=0;i<_somevec.size();i++)
_ptr.push_back(&(_somevec[i]));
}
I do not know what mistake I am making. Your help/input is greatly appreciated!!
This does not look very safe to me.
In _ptr.push_back(&(_somevec.back())); you are taking the address of an element in _somevec. But when _somevec is changed by e.g. a push_back the address will be invalid!
For example this will not work:
std::vector<int> v;
v.push_back(17);
int* p = &v.back(); // pointer is valid
for(int i=0; i<10; i++) v.push_back(0);
*p = 42; // ERROR: p is no longer valid!
You can make this a bit better by using reserve prior to using push_back.
std::vector<int> v;
v.reserve(100); // v will now have enough room to not reallocate memory
v.push_back(17);
int* p = &v.back(); // pointer is valid
for(int i=0; i<10; i++) v.push_back(0);
*p = 42; // (probably) OK: p is still valid
But I would not recommend to do this.
It looks as if you are inserting elements into one vector and referencing them with pointers from another vector. However, there doesn't seem to be any precaution against the vector of objects running out of capacity and relocating the objects, invalidating all the pointers.
The easiest approach to verify this hypothesis is to see if the capacity() of the vector of objects ever changes. The easiest fix is probably to use a std::deque<mystruct> as a std::deque<T> doesn't relocate its objects when adding/removing objects at either end.
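A minimal sketch of the std::deque fix (Record and fill are hypothetical names standing in for mystruct and DoSomething). Pushing at either end of a deque leaves existing elements where they are, so pointers taken earlier remain usable.

```cpp
#include <cassert>
#include <deque>
#include <vector>

struct Record { int id; };

// Appends n records and collects a pointer to each one as it is added.
// std::deque keeps existing elements in place when growing at either end.
std::vector<Record*> fill(std::deque<Record>& store, int n) {
    assert(n >= 0);
    std::vector<Record*> ptrs;
    for (int i = 0; i < n; ++i) {
        store.push_back(Record{i});
        ptrs.push_back(&store.back());
    }
    return ptrs;
}
```

Inserting or erasing in the middle of a deque still invalidates pointers, so this only helps for append-style growth.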
as in the title is it possible to join a number of arrays together without copying and only using pointers? I'm spending a significant amount of computation time copying smaller arrays into larger ones.
Note: I can't use vectors since umfpack (some matrix solving library) does not allow me to, or I don't know how.
As an example:
int n = 5;
// dynamically allocate array with use of pointer
int *a = new int[n];
// define array pointed by *a as [1 2 3 4 5]
for(int i=0;i<n;i++) {
a[i]=i+1;
}
// pointer to array of pointers ??? --> this does not work
int *large_a = new int[4];
for(int i=0;i<4;i++) {
large_a[i] = a;
}
Note: There is already a simple solution I know and that is just to iteratively copy them to a new large array, but would be nice to know if there is no need to copy repeated blocks that are stored throughout the duration of the program. I'm in a learning curve atm.
thanks for reading everyone
as in the title is it possible to join a number of arrays together without copying and only using pointers?
In short, no.
A pointer is simply an address into memory - like a street address. You can't move two houses next to each other, just by copying their addresses around. Nor can you move two houses together by changing their addresses. Changing the address doesn't move the house, it points to a new house.
Note: I can't use vectors since umfpack (some matrix solving library) does not allow me to, or I don't know how.
In most cases, you can pass the address of the first element of a std::vector when an array is expected.
std::vector<int> a = {0, 1, 2}; // C++11 list initialization
void c_fn_call(int*);
c_fn_call(&a[0]);
This works because vector guarantees that the storage for its contents is always contiguous.
However, when you insert or erase an element from a vector, it invalidates pointers and iterators that came from it. Any pointers you might have gotten from taking an element's address no longer point to the vector, if the storage that it has allocated must change size.
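An equivalent way to hand a vector to an array-based API is data(), available since C++11. In this sketch c_style_sum is a hypothetical stand-in for whatever C-style function the library exposes; it is not an umfpack call.

```cpp
#include <cassert>
#include <vector>

// Stand-in for a C-style API that expects a raw array and its length (hypothetical).
int c_style_sum(const int* values, int count) {
    assert(count >= 0);
    int total = 0;
    for (int i = 0; i < count; ++i) total += values[i];
    return total;
}

// Passes the vector's contiguous storage to the C-style function.
int sum_vector(const std::vector<int>& v) {
    // data() may be called on an empty vector, whereas &v[0] may not.
    return c_style_sum(v.data(), static_cast<int>(v.size()));
}
```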
No. The memory of two arrays is not necessarily contiguous, so there is no way to join them without copying. And array elements must be in contiguous memory, or pointer access would not be possible.
I'd probably use memcpy/memmove, which is still going to be copying the memory around, but at least it's been optimized and tested by your compiler vendor.
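For reference, a memcpy-based concatenation might look like the sketch below (concat is a hypothetical name). It assumes the destination has room for both inputs and does not overlap them, which is what memcpy requires.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copies a (na elements) followed by b (nb elements) into dest.
// dest must have room for na + nb ints and must not overlap the inputs.
void concat(int* dest, const int* a, std::size_t na, const int* b, std::size_t nb) {
    assert(dest != nullptr && a != nullptr && b != nullptr);
    std::memcpy(dest, a, na * sizeof(int));
    std::memcpy(dest + na, b, nb * sizeof(int));
}
```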
Of course, the "real" C++ way of doing it would be to use standard containers and iterators. If you've got memory scattered all over the place like this, it sounds like a better idea to me to use a linked list, unless you are going to do a lot of random access operations.
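If a linked list is acceptable, joining really can avoid copying elements: std::list::splice relinks the nodes of one list onto another. A small sketch, with join_lists as a hypothetical wrapper:

```cpp
#include <cassert>
#include <list>

// Moves every node of tail onto the end of head without copying elements.
void join_lists(std::list<int>& head, std::list<int>& tail) {
    assert(&head != &tail);
    head.splice(head.end(), tail);  // relinks nodes; tail is left empty
}
```

The trade-off is the one already noted: a list has no constant-time random access and cannot be handed to an API that expects one contiguous array.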
Also, keep in mind that if you use pointers and dynamically allocated arrays instead of standard containers, it's a lot easier to cause memory leaks and other problems. I know sometimes you don't have a choice, but just saying.
If you want to join arrays without copying the elements and at the same time you want to access the elements using subscript operator i.e [], then that isn't possible without writing a class which encapsulates all such functionalities.
I wrote the following class with minimal consideration, but it demonstrates the basic idea, and you can extend it with whatever functionality it currently lacks. Error handling is also missing, which I left out to keep the code short, but I believe you will understand the code and handle error cases accordingly.
template<typename T>
class joinable_array
{
std::vector<T*> m_data;
std::vector<size_t> m_size;
size_t m_allsize;
public:
joinable_array() : m_allsize() { }
joinable_array(T *a, size_t len) : m_allsize() { join(a,len);}
void join(T *a, size_t len)
{
m_data.push_back(a);
m_size.push_back(len);
m_allsize += len;
}
T & operator[](size_t i)
{
index ix = get_index(i);
return m_data[ix.v][ix.i];
}
const T & operator[](size_t i) const
{
index ix = get_index(i);
return m_data[ix.v][ix.i];
}
size_t size() const { return m_allsize; }
private:
struct index
{
size_t v;
size_t i;
};
index get_index(size_t i) const
{
index ix = { 0, i};
for(auto it = m_size.begin(); it != m_size.end(); it++)
{
if ( ix.i >= *it ) { ix.i -= *it; ix.v++; }
else break;
}
return ix;
}
};
And here is one test code:
#define alen(a) (sizeof(a)/sizeof(*(a)))
int main() {
int a[] = {1,2,3,4,5,6};
int b[] = {11,12,13,14,15,16,17,18};
joinable_array<int> arr(a,alen(a));
arr.join(b, alen(b));
arr.join(a, alen(a)); //join it again!
for(size_t i = 0 ; i < arr.size() ; i++ )
std::cout << arr[i] << " ";
}
Output:
1 2 3 4 5 6 11 12 13 14 15 16 17 18 1 2 3 4 5 6
Online demo : http://ideone.com/VRSJI
Here's how to do it properly:
template<class T, class K1, class K2>
class JoinArray {
public:
    JoinArray(K1 &k1, K2 &k2) : k1(k1), k2(k2) { }
    // Indices below k1.size() go to the first array, the rest to the second.
    T operator[](int i) const {
        int s = k1.size();
        if (i < s) return k1[i];
        return k2[i - s];
    }
    int size() const { return k1.size() + k2.size(); }
private:
    K1 &k1;
    K2 &k2;
};
template<class T, class K1, class K2>
JoinArray<T,K1,K2> join(K1 &k1, K2 &k2) { return JoinArray<T,K1,K2>(k1,k2); }
template<class T>
class NativeArray
{
public:
    NativeArray(T *ptr, int size) : ptr(ptr), size_(size) { }
    T operator[](int i) const { return ptr[i]; }
    int size() const { return size_; }
private:
    T *ptr;
    int size_; // renamed so it does not clash with size()
};
int main() {
    int array[2] = { 0,1 };
    int array2[2] = { 2,3 };
    NativeArray<int> na(array, 2);
    NativeArray<int> na2(array2, 2);
    // T cannot be deduced from the arguments, so it is given explicitly.
    auto joinarray = join<int>(na,na2);
}
A variable that is a pointer to a pointer must be declared as such.
This is done by placing an additional asterisk in front of its name.
Hence, int **large_a = new int*[4]; In your code each large_a[i] is assigned a pointer, but you declared large_a as a pointer to int, so its elements are ints rather than pointers. It should be declared as a pointer to a pointer. If the storage is allocated later, the declaration int **large_a; on its own could be enough.
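Put together, the corrected declaration could be used like this (make_views is a hypothetical helper). Note that the result is a table of pointers, not one contiguous block of ints, so code that expects a single flat int* array still cannot consume it directly.

```cpp
#include <cassert>
#include <cstddef>

// Builds an array of count pointers that all refer to the same block.
// Only the addresses are stored; the ints themselves are not copied.
int** make_views(int* block, std::size_t count) {
    assert(block != nullptr);
    int** views = new int*[count];
    for (std::size_t i = 0; i < count; ++i) views[i] = block;
    return views;  // caller must delete[] views (not the block)
}
```

Elements are then reached with two subscripts, for example large_a[3][4], and a change made through the original array is visible through every entry.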