C++ default behavior for arrays - shallow copies? - c++

As the title says - I need the content of the v array not to change outside of the function.
I thought that it would not happen (because AFAIK the default behavior of arrays is to deep copy - but it looks like it is not).
#include <iostream>
using namespace std;
void testModif(int *v)
{
for (int i = 0; i < 5; i++)
{
v[i]++;
}
}
int main()
{
int *v;
v = new int[100];
*v = 0;
testModif(v);
for (int i = 0; i < 5; i++)
{
cout << v[i] << " ";
}
}
How can I make v not change after running testModif?

First, you can enlist the compiler's help by declaring the function as
void testModif(const int *v);
That is, v is a non-const pointer to const int: you can, for example, increment the local copy of the pointer, but you cannot change what the pointer points to.
With that change, the code shown will fail to compile, which at least tells you that you are doing what you do not want to do.
Then you need to change your code to follow your self-decided rule.
For the code shown, that requires making a copy of the array itself (not only of the pointer pointing to it). That is exactly a "deep copy", in contrast to the "shallow copy" you mention.
Closest to your code would be a local copy: an array of size 5 (a magic number, but you know what I mean), or a dynamically allocated block of the proper size (freed, of course, before leaving the function).
Obviously, making a local copy, incrementing all its members, and then leaving the function would normally seem pointless. I trust this is only a consequence of reducing the problem to an appropriate MRE, and that the point is that you intentionally want to increment only locally, without changing the original values.
Of course I am not contradicting the comments to your question, which recommend the more C++ way of doing things, with standard containers, like std::array, std::vector.
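A minimal sketch of the local-copy idea, using a standard container as the comments suggest. The function name incrementedCopy and the explicit count parameter are assumptions made for illustration; they are not part of the question's code.

```cpp
#include <cstddef>
#include <vector>

// Returns incremented values without touching the caller's array.
// `incrementedCopy` and `count` are hypothetical names for this sketch.
std::vector<int> incrementedCopy(const int* v, std::size_t count)
{
    // Deep copy: the vector owns its own copy of the elements.
    std::vector<int> local(v, v + count);
    for (int& x : local) {
        ++x;  // modifies only the copy; *v is const and stays unchanged
    }
    return local;
}
```

Because the parameter is const int*, any attempt to write v[i]++ inside the function is rejected by the compiler. Separately, note that *v = 0; in the question initializes only v[0]; the remaining elements of new int[100] hold indeterminate values.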

Related

Variable as reference not staying

So i read this thread and many others:
Function does not change passed pointer C++
Yet i still can't solve my issue.
I have a function declared like this:
void test(list<int*> *listNodes){
int v=5;
(*listNodes).push_back(&v);
(*listNodes).push_back(&v);
(*listNodes).push_back(&v);
for(int a = 0; a < (*listNodes).size(); a ++){
std::list<int*>::iterator i = (*listNodes).begin();
advance(i, a);
int *totry = *i;
cout << *totry;
cout << ",";
}
}
Which works and prints fine; by that I mean the listNodes variable has 3 elements, all 5s. However, when this function returns, the values are not updated; by that I mean the variable contains garbage. I call this function from another one like this:
void create(list<int*> listNodes){
test(&listNodes);
for(list<int*>::const_iterator it=listNodes.begin();
it!=listNodes.end(); it++){
int *show=*it;
cout << *show << '\n';
}
}
Again, in this function, the cout will output memory garbage instead of outputting the 3 fives.
Any ideas on how I should proceed so that the list is populated when the function test returns?
The problem I believe you're thinking about (as opposed to other problems in this code) is not actually what you're thinking. The list DOES maintain its values, the problem is that the values it has are pointing to garbage memory.
When you do this:
int v=5;
(*listNodes).push_back(&v);
(*listNodes).push_back(&v);
(*listNodes).push_back(&v);
You are putting three copies of the address of v into the list. You have declared v as a stack variable that only exists for the duration of this function. When you print the values pointed to by the elements of listNodes inside function test, that variable still exists in that memory location.
When you later print out the values pointed to by the elements of listNodes in function create, that variable has gone out of scope and its memory has been reused by something else, hence the garbage.
Here are two possible solutions to consider:
Use list<int> instead of list<int *>. If all you want to do is store a list of integers, this is the way to go.
If, on the other hand, you really need to store pointers to those integers, you'll need to allocate memory off the heap:
int* v = new int(); // allocate an int on the heap
*v = 5; // store 5 in that int
(*listNodes).push_back(v); // save the pointer to the allocated
// memory in *listNodes
etc
This is not very good in terms of modern c++, however, as you generally don't want to be handling raw pointers at all, but it illustrates the point I think you are struggling with.
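A short sketch of the first suggestion (storing the integers themselves rather than pointers to them). The function name fillWithFives is made up for this example, and the list is taken by reference, which is an assumption about how you would want to call it.

```cpp
#include <list>

// Stores the integers themselves, so nothing dangles after the function returns.
// `fillWithFives` is a hypothetical name used only in this sketch.
void fillWithFives(std::list<int>& nodes)
{
    int v = 5;
    nodes.push_back(v);  // push_back copies the value of v into the list
    nodes.push_back(v);
    nodes.push_back(v);
}   // v is destroyed here, but the list holds its own copies
```

Since the list owns copies of the values, it does not matter that v ceases to exist when the function returns.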
In this code,
void create(list<int*> listNodes){
listNodes=teste(&listNodes);
… the formal argument listNodes is passed by value. That means that the function receives a copy of whatever was passed as the actual argument at the call site. Changes to this copy will not be reflected in the actual argument.
The call to teste won't call the test function, since it's a different name.
In a way that's good, because test is declared as a void function so it can't return anything.
But it's also bad, because it means that a very crucial piece of your code, the teste function that's actually called, isn't shown at all in your question.
The test function,
void test(list<int*> *listNodes){
int v=5;
(*listNodes).push_back(&v);
for(int a = 0; a < (*listNodes).size(); a ++){
std::list<int*>::iterator i = (*listNodes).begin();
advance(i, a);
int *totry = *i;
cout << *totry;
cout << ",";
}
printf("\n");
}
… has a lot wrong with it.
Starting at the top, in C++ the pointer argument
void test(list<int*> *listNodes){
… should better be a pass-by-reference argument. A pointer can be null. That doesn't make sense for this function, and the code is not prepared to handle that.
Next, in
int v=5;
(*listNodes).push_back(&v);
… the address of a local variable is pushed on a list that's returned. But at that point the local variable ceases to exist, and you have a dangling pointer, one that used to point to something, but doesn't anymore. If the caller uses that pointer then you have Undefined Behavior.
Next, this loop,
for(int a = 0; a < (*listNodes).size(); a ++){
std::list<int*>::iterator i = (*listNodes).begin();
advance(i, a);
… will work, but it needlessly has O(n²) complexity, i.e. quadratic execution time, because advance walks the list from the beginning on every pass through the loop.
Just iterate with the iterator. That's what iterators are for. Iterating.
Summing up, the garbage you see is due to the undefined behavior.
Just, don't do that.
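To illustrate the two suggestions above (a reference parameter instead of a pointer, and iterating with the iterator itself), here is a rough sketch. It assumes the list stores plain int values, and joinValues is a hypothetical helper that returns the text instead of printing it.

```cpp
#include <list>
#include <sstream>
#include <string>

// Formats every element in a single linear pass over the list.
// `joinValues` is a hypothetical helper, not code from the question.
std::string joinValues(const std::list<int>& nodes)  // a reference cannot be null
{
    std::ostringstream out;
    for (std::list<int>::const_iterator it = nodes.begin(); it != nodes.end(); ++it) {
        out << *it << ",";  // each step advances the iterator by exactly one node
    }
    return out.str();
}
```

Each element is visited once, so the loop is O(n) rather than O(n²).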

How can I make my dynamic array or vector operate at a similar speed to a standard array? C++

I'm still quite inexperienced in C++ and I'm trying to write some code to add numbers precisely. This is a DLL plugin for some finite difference software, and the code is called several million times during a run. I want to write a function where any number of arguments can be passed in and the sum will be returned. My code looks like:
#include <cstdarg>
double SumFunction(int numArgs, ...){ // this allows me to pass any number
// of arguments to my function.
va_list args;
va_start(args,numArgs); //necessary prerequisites for using cstdarg
double myarray[10];
for (int i = 0; i < numArgs; i++) {
myarray[i] = va_arg(args,double);
} // I imagine this is sloppy code; however I cannot create
// myarray[numArgs] because numArgs is not a const int.
sum(myarray); // The actual method of addition is not relevant here, but
//for more complicated methods, I need to put the summation
// terms in a list.
vector<double> vec(numArgs); // instead, place all values in a vector
for (int i = 0; i < numArgs; i++) {
vec.at(i) = va_arg(args,double);
}
sum(vec); //This would be passed by reference, of course. The function sum
// doesn't actually exist, it would all be contained within the
// current function. This method is twice as slow as placing
//all the values in the static array.
double *vec;
vec = new double[numArgs];
for (int i = 0; i < (numArgs); i++) {
vec[i] = va_arg(args,double);
}
sum(vec); // Again half of the speed of using a standard array and
// increasing in magnitude for every extra dynamic array!
delete[] vec;
va_end(args);
}
So the problem I have is that using an oversized static array is sloppy programming, but using either a vector or a dynamic array slows the program down considerably. So I really don't know what to do. Can anyone help, please?
One way to speed the code up (at the cost of making it more complicated) is to reuse a dynamic array or vector between calls; that way you avoid the overhead of memory allocation and deallocation each time you call the function.
For example declare these variables outside your function either as global variables or as member variables inside some class. I'll just make them globals for ease of explanation:
double* sumArray = NULL;
int sumArraySize = 0;
In your SumFunction, check if the array exists and if not allocate it, and resize if necessary:
double SumFunction(int numArgs, ...){ // this allows me to pass any number
// of arguments to my function.
va_list args;
va_start(args,numArgs); //necessary prerequisites for using cstdarg
// if the array has already been allocated, check if it is large enough and delete if not:
if((sumArray != NULL) && (numArgs > sumArraySize))
{
delete[] sumArray;
sumArray = NULL;
}
// allocate the array, but only if necessary:
if(sumArray == NULL)
{
sumArray = new double[numArgs];
sumArraySize = numArgs;
}
double *vec = sumArray; // set to your array, reusable between calls
for (int i = 0; i < (numArgs); i++) {
vec[i] = va_arg(args,double);
}
sum(vec, numArgs); // you will need to pass the array size
va_end(args);
// note no array deallocation
}
The catch is that you need to remember to deallocate the array at some point by calling a function similar to this (like I said, you pay for speed with extra complexity):
void freeSumArray()
{
if(sumArray != NULL)
{
delete[] sumArray;
sumArray = NULL;
sumArraySize = 0;
}
}
You can take a similar (and simpler/cleaner) approach with a vector, allocate it the first time if it doesn't already exist, or call resize() on it with numArgs if it does.
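The vector variant could look roughly like this. It is only a sketch: the function-local static buffer is my assumption about where to keep the vector, it is not thread-safe, and plain addition stands in for the real summation method.

```cpp
#include <cstdarg>
#include <cstddef>
#include <vector>

// Sums `numArgs` doubles passed as variadic arguments.
// Assumption: a function-local static vector is acceptable (not thread-safe).
double SumFunction(int numArgs, ...)
{
    static std::vector<double> buffer;  // keeps its capacity between calls
    const std::size_t count = numArgs > 0 ? static_cast<std::size_t>(numArgs) : 0;
    buffer.resize(count);               // allocates only when it has to grow

    va_list args;
    va_start(args, numArgs);
    for (std::size_t i = 0; i < count; ++i) {
        buffer[i] = va_arg(args, double);
    }
    va_end(args);

    double total = 0.0;                 // placeholder for the real summation
    for (double x : buffer) {
        total += x;
    }
    return total;
}
```

The vector releases its memory automatically at program exit, so no separate cleanup function is needed in this version.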
When using a std::vector, the optimizer must consider that reallocation is possible, and this introduces an extra indirection.
In other words the code for
v[index] += value;
where v is for example a std::vector<int> is expanded to
int *p = v._begin + index;
*p += value;
i.e. from vector you need first to get the field _begin (that contains where the content starts in memory), then apply the index, and then dereference to get the value and mutate it.
If the code performing the computation on the elements of the vector in a loop calls any unknown non-inlined code, the optimizer is forced to assume that unknown code may mutate the _begin field of the vector and this will require doing the two-steps indirection for each element.
(NOTE: the fact that the vector is passed as a const std::vector<T>& reference is totally irrelevant: a const reference doesn't mean that the vector is const, it simply limits which operations are permitted through that reference; external code could hold a non-const reference to the vector, and constness can also be legally cast away... constness of references is basically ignored by the optimizer).
One way to remove this extra lookup (if you know that the vector is not being resized during the computation) is to cache this address in a local and use that instead of the vector operator [] to access the element:
int *p = &v[0];
for (int i=0,n=v.size(); i<n; i++) {
/// use p[i] instead of v[i]
}
This will generate code that is almost as efficient as a static array because, given that the address of p is not published, nothing in the body of the loop can change it and the value p can be assumed constant (something that cannot be done for v._begin as the optimizer cannot know if someone else knows the address of _begin).
I'm saying "almost" because a static array only requires indexing, while using a dynamically allocated area requires "base + indexing" access; most CPUs however provide this kind of memory access at no extra cost. Moreover if you're processing elements in sequence the indexing addressing becomes just a sequential memory access but only if you can assume the start address constant (i.e. not in the case of std::vector<T>::operator[]).
Assuming that the "max storage ever needed" is in the order of 10-50, I'd say using a local array is perfectly fine.
Using vector<T> will use 3 * sizeof(T*) (at least) to track the contents of the vector. So if we compare that to an array double arr[10];, the array puts 7 more elements of equal size on the stack (or 8.5 in a 32-bit build). But the vector also needs a call to new, which takes a size argument. That takes up AT LEAST one, more likely 2-3, elements of stack space, and the implementation of new is quite possibly not straightforward, so further calls are needed, which take up further stack space.
If you "don't know" the number of elements, and need to cope with quite large numbers of elements, then use a hybrid solution: keep a small stack-based local array, and if numargs > small_size use a vector instead and pass vec.data() to the function sum.
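A sketch of that hybrid approach follows. The threshold kSmallSize, the name SumHybrid, and the stand-in sum function are all assumptions for illustration; pick a threshold that matches your typical argument count.

```cpp
#include <cstdarg>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the real summation routine.
double sum(const double* values, std::size_t count)
{
    double total = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}

double SumHybrid(int numArgs, ...)
{
    constexpr std::size_t kSmallSize = 16;  // assumed threshold
    const std::size_t count = numArgs > 0 ? static_cast<std::size_t>(numArgs) : 0;

    double stackBuf[kSmallSize];            // used in the common, small case
    std::vector<double> heapBuf;            // stays empty unless needed
    double* data = stackBuf;
    if (count > kSmallSize) {
        heapBuf.resize(count);              // the only path that allocates
        data = heapBuf.data();
    }

    va_list args;
    va_start(args, numArgs);
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = va_arg(args, double);
    }
    va_end(args);

    return sum(data, count);
}
```

An empty std::vector does not allocate on typical implementations, so calls that fit in the stack buffer should avoid the heap entirely.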

Object not added to array

I'm doing my homework (and learn how C++ works).
My task is:
Define some class with field...(never mind)
Create a vector and an array from these objects and iterate over them (listing, average by field, etc.).
Now it works correctly with the vector, but the array does not work:
static Cipo* cipok; // object array
static int cep = 0; // endpoint index
static int ccap = 0; // array size
Default assignment operator for Cipo:
public: Cipo& operator=(const Cipo &c)
{
return ((Cipo&)c);
}
Initialization:
cipok = (Cipo*) malloc(sizeof(Cipo*)*100); // new Cipo[num] doesn't work..
ccap = 100;
Test code:
for (int i = 0; i < 5; i++)
{
Cipo c(43.5, "str", 12670, false, false);
std::cout << c.ar <<" ";
cipok[cep] = c;
std::cout << cipok[cep].ar << " ";
cep++;
}
And the result:
12670 0 12670 0 12670 0 12670 0 12670 0
But the objects do not "disappear" if I use a vector, push_back() the objects, and read from the vector with direct indexing (or with iterators). Why does the array behave this way?
Your immediate problem is likely caused by the wacky implementation of operator=, which does absolutely nothing. I'd recommend stepping through the code in a debugger to see it. operator= (and the copy constructor) should properly copy values into the destination object.
There are many other issues with the code: your naming convention is ... interesting, and you seem to cast whatever you have to whatever type makes the code compile, without reasoning about what should actually be done. In particular, malloc(sizeof(Cipo*)*100) reserves space for 100 pointers, not for 100 Cipo objects, and never runs a constructor. malloc is very rarely needed in C++ code...
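For illustration, here is a sketch of an assignment operator that actually copies, plus an array of real objects. The question does not show the class, so every member name except ar is a guess, and makeShoes is a hypothetical helper. My assumption that new Cipo[num] failed for lack of a default constructor is also only a guess.

```cpp
#include <cstddef>
#include <string>
#include <vector>

class Cipo {
public:
    double size;       // assumed member
    std::string name;  // assumed member
    int ar;            // price, the field printed in the question
    bool flagA;        // assumed member
    bool flagB;        // assumed member

    // A default constructor is required to create an array of objects.
    Cipo() : size(0.0), name(), ar(0), flagA(false), flagB(false) {}
    Cipo(double size_, const std::string& name_, int ar_, bool flagA_, bool flagB_)
        : size(size_), name(name_), ar(ar_), flagA(flagA_), flagB(flagB_) {}
    Cipo(const Cipo&) = default;

    // Copies every member into *this and returns *this, not the argument.
    Cipo& operator=(const Cipo& other)
    {
        size = other.size;
        name = other.name;
        ar = other.ar;
        flagA = other.flagA;
        flagB = other.flagB;
        return *this;
    }
};

// Builds 100 default-constructed objects and assigns into the first `count`.
std::vector<Cipo> makeShoes(std::size_t count)
{
    std::vector<Cipo> cipok(100);
    for (std::size_t i = 0; i < count && i < cipok.size(); ++i) {
        cipok[i] = Cipo(43.5, "str", 12670, false, false);
    }
    return cipok;
}
```

In practice you could delete the hand-written operator= entirely: the compiler-generated one performs the same member-wise copy.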
I think the general problem is that I have always programmed in Java (but now at university I must program in C/C++; my naming conventions are as in Java, and in Hungarian "cipő" means "shoe"). In Java there are no pointers, and all objects are always accessed by reference. But it looks like (as I tested) when I create a new object array, C++ does not allocate just 100 pointers to objects (where the object data starts); it allocates 100*sizeof(object), and I can put data into that space through the assignment operator.
Is my theory true?
So I tried to manage object access like in Java.
Why copy the object if it already exists? (I don't like to "clone" objects.)

Copying values from one vector to another (from book)

Consider this piece of code.
#include <iostream>
#include <vector>
using namespace std;
int main()
{
vector <int *> test;
vector <int *> v;
int *a = new int;
int *b = new int;
*a = 1;
*b = 2;
v.push_back (a);
v.push_back (b);
for (int i = 0; i < 2; ++i)
{
int n = *v[i];
test.push_back (&n);
}
cout << *test[0] << " " << *test[1] << endl;
delete a;
delete b;
return 0;
}
The problem's statement is:
"Given this code, answer the following questions:
Why does "test" vector contain only 2's?
How can we change for loop to copy properly (only code inside for loop)?"
I couldn't answer any of these questions, so a little bit of help will be appreciated.
Thanks in advance.
That code introduces dangling pointers. The body of the loop looks like this:
{
int n = *v[i];
test.push_back (&n);
}
The local variable n loses scope as soon as the loop body ends, so the pointer &n is now a dangling pointer. If it happens that test contains only 2's, that's just what randomly came out of what is undefined behavior.
If you want to "properly" copy the data over to test, you can change the for loop body to this:
{
int* n = new int;
*n = *v[i];
test.push_back (n);
}
Please take the "properly" with a grain of salt...
You push the same pointer to n into the test vector twice. n equals the last element of your first vector. Note that after control flow exits the loop, all pointers to n become invalid. So in fact your test vector contains invalid pointers, not pointers to 2s.
You should create a copy of each integer:
int* n = new int(*v[i]);
test.push_back (n);
Note also that you have a memory leak here. Each int created using new should later be destroyed using delete.
The first question is a trick question: The vector contains pointers to a variable that no longer exists, and dereferencing that could cause pretty much any output. I imagine on some machines and compilers it prints all 2s however.
I can't understand what the exercise is trying to do (why does it use vectors of pointers for example) so I can't really help with how to solve the problem.
One way you could do it is by making test store by value:
First change the test vector to vector <int> test;
Then change the push_back to something like test.push_back (n); and finally the print statements to remove the now-unneeded * operators.
EDIT for comment:
First, I'm suspicious of this book: it shouldn't be demonstrating undefined behavior or raw pointers to single builtin types. But you can change your loop body if you want:
for (int i = 0; i < 2; ++i)
{
int* n = new int;
*n = *v[i];
test.push_back (n);
}
Note that this will cause a memory leak unless you later delete those pointers, a problem that storing by value eliminates.
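The store-by-value alternative described earlier in this answer might look like the sketch below. It changes the type of test, so it goes beyond what the exercise allows; copyValues is a hypothetical helper name.

```cpp
#include <cstddef>
#include <vector>

// Copies the pointed-to integers into a vector that owns its values.
// `copyValues` is a hypothetical helper, not part of the exercise.
std::vector<int> copyValues(const std::vector<int*>& v)
{
    std::vector<int> test;
    test.reserve(v.size());
    for (std::size_t i = 0; i < v.size(); ++i) {
        int n = *v[i];
        test.push_back(n);  // stores a copy of n, not its address
    }
    return test;
}
```

Nothing here is allocated with new, so there is nothing to delete and nothing that can dangle.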
1) I think that the premise of the question is faulty. The loop adds two elements to test, each contains the address of the automatic variable n, the scope of which is limited to the body of the loop. It's not guaranteed that n will be allocated the same memory location in both passes through the loop, but I suppose that it's likely that most compilers will reuse the same location in both passes.
Moreover, n is out of scope at the output statement. So referencing the pointers in test to those memory locations is undefined. Again, there's a good chance that they will still contain the values assigned in the loop.
So, only if the same location gets reused for n in the second pass of the loop and that location has not been overwritten at the time the output statement is executed, will the output be "2 2". There is no guarantee of either of these premises.
2) To get the output "1 2" without changing anything outside the loop, one could change the definition of n to int& n = *v[i], which would be a single character change from the given code, though the end result is rather strange.
A simpler solution would be to eliminate the temporary n and simply test.push_back(v[i]).

How to copy a structure with pointers to data inside?

I am using the CUDD package for BDDs manipulation.
I want to make a copy for a big data structure in it that is called the DdManager.
The problem is that this data structure has many pointers inside it, so when I make a direct copy it is a "shallow" copy (as some use the term), i.e. the pointers in the new copy point to the same places pointed to by the original, so when I change one of them I also change the other, which is undesirable.
Trying to make a copy function by hand is not feasible because the data structure is really big and very detailed, with many pointers to other complex structures!
I have tried the vector solutions described here but I did not get the expected result because there are many nested structures and pointers and I want a completely new copy.
Here is a code sample of what I want to do :
#include <iostream>
#include <cstdlib>
#include <string.h>
#include <vector>
using namespace std;
struct n1
{
int a;
char *b;
};
struct n2
{
int **c;
struct n1 *xyz;
};
typedef struct
{
vector<struct n2> x;
}X;
int main()
{
struct n2 s1;
s1.xyz = (struct n1*)malloc(sizeof(struct n1));
s1.xyz->a = 3;
s1.xyz->b = (char*)malloc(5);
s1.xyz->b[0] = '\0';
strcat(s1.xyz->b,"Mina");
s1.c = (int**)malloc(5 * sizeof(int*));
for(int i = 0; i < 5; i++)
s1.c[i] = (int*)malloc(5 * sizeof(int));
for(int i = 0; i < 5; i++)
for(int j = 0 ; j < 5 ; j++)
s1.c[i][j] = i + j;
X struct1,struct2;
vector<struct n2>::iterator it;
it = struct1.x.begin();
it = struct1.x.insert(it,s1);
it = struct2.x.begin();
it = struct2.x.insert(it,struct1.x[0]);
cout<<"struct2.x[0].c[1][2] = "<<struct2.x[0].c[1][2] <<" !"<<endl; // This is equal to 3
(struct2.x[0].c[1][2])++; //Now it becomes 4
cout<<"struct2.x[0].c[1][2] = "<<struct2.x[0].c[1][2] <<" !"<<endl; //This will print 4
cout<<"s1.c[1][2] "<< s1.c[1][2]<<" !"<<endl; // This will also print 4 ... that's the wrong thing
return 0;
}
Despite others saying that you have to
make a copy function by hand
...to solve this, I think that's the wrong approach for you. Here's why, and here's a suggestion.
You're trying to create a copy of a CUDD DdManager object, which is an integral part of the complex CUDD library. CUDD internally uses reference counts for some objects (which might help you here...), but the DdManager object effectively represents an entire instance of the library, and I have no idea how the reference counts would work across instances.
The CUDD library and its associated C++ wrapper don't seem to provide the copy constructors needed to create separate copies of the DdManager, and adding them would probably involve serious effort and detailed internal knowledge of a library that you are just trying to use as a client. While it's possible to do this, it's a complex thing to do.
Instead, I'd look at writing out the current BDD to a file/stream/whatever, and then reading it back into a new instance of a DdManager. There's a library called dddmp that should help you with this.
I'd also recommend that the C++ wrapper be modified to make the DdManager class non-copyable.
"Trying to make a copy function by hand is not feasible because the data structure is really big and very detailed with many pointer to other complex structures also !!! "
This is exactly what you have to do.
The object-oriented approach means that you don't write one big do-it-all copy method. Instead, every object (structure) copies only itself and then calls its sub-objects' copy methods, and so on, until there is nothing left to copy.
There is no such thing as a "vector solution"; a vector is simply the smallest object, with its own smallest copy method.
There is no difference between struct and class, so just write copy methods for them.
Only you know the structure of your data, so you're the only One who can save the humankind (or copy this data).
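As a rough sketch of "every structure copies itself", applied to the toy n1/n2 structs from the question (not to CUDD's DdManager). The names N1 and N2 and the stored rows/cols dimensions are assumptions; the question's n2 does not record its array sizes. Assignment is deleted and exception safety is ignored to keep the example short.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

struct N1 {
    int a;
    char* b;

    N1(int a_, const char* text) : a(a_), b(new char[std::strlen(text) + 1])
    {
        std::strcpy(b, text);
    }
    // Deep copy: duplicates the string instead of sharing the pointer.
    N1(const N1& other) : a(other.a), b(new char[std::strlen(other.b) + 1])
    {
        std::strcpy(b, other.b);
    }
    ~N1() { delete[] b; }
    N1& operator=(const N1&) = delete;  // left out of this sketch
};

struct N2 {
    std::size_t rows, cols;  // assumed: dimensions stored next to c
    int** c;
    N1* xyz;

    N2(std::size_t rows_, std::size_t cols_, const N1& n)
        : rows(rows_), cols(cols_), c(new int*[rows_]), xyz(new N1(n))
    {
        for (std::size_t i = 0; i < rows; ++i) {
            c[i] = new int[cols]();  // zero-initialized row
        }
    }
    // Copies its own arrays, then delegates to N1's copy constructor.
    N2(const N2& other)
        : rows(other.rows), cols(other.cols),
          c(new int*[other.rows]), xyz(new N1(*other.xyz))
    {
        for (std::size_t i = 0; i < rows; ++i) {
            c[i] = new int[cols];
            std::copy(other.c[i], other.c[i] + cols, c[i]);
        }
    }
    ~N2()
    {
        for (std::size_t i = 0; i < rows; ++i) {
            delete[] c[i];
        }
        delete[] c;
        delete xyz;
    }
    N2& operator=(const N2&) = delete;  // left out of this sketch
};
```

With copy constructors like these, copying an N2 yields fully independent data. Replacing the raw pointers with std::string and std::vector members would give the same deep-copy behavior without any hand-written copy code.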