I have a class "foo" that has a multi dimensional array and need to provide a copy of the array through a getArray member. Is there a nice way of doing this when the array is dynamically created so I can not pass the array back a const as the array is always being deleted, recreated etc. I thought about creating a new dynamic array to pass it back but is this acceptable as the calling code would need to know to delete this etc.
Return an object, not a naked array. The object can have a copy constructor, destructor etc. which will do the copying, deletion etc. for the user.
class Matrix {
// handle creation and access to your multidim array
// including copying, deletion etc.
};
class A { // your class
Matrix m; // the classes matrix
Matrix getArray() {
return m;
}
};
Easy answer to your question is, not this is not a good design, as it should be the creating class that should handle the deletion/release of the array.
The main point is why do you keep deleting/recreating this multi dimensional array? Can you not create one instance, and then just modify when need be?
Personally I would return the array as it is, and iterate over it and do any calculations/functions on it during the loop therefore saving resources by not creating/deleting the array.
Neil's probably the best answer. The second best will be not to use an array. In C++, when you talk about dynamic array, it means vector.
There are two possibilities:
nested vectors: std::vector<int, std::vector<int> >(10, std::vector<int>(20))
simple vector: std::vector<int>(200)
Both will have 200 items. The first is clearly multi-dimensional, while the second leaves you the task of computing offsets.
The second ask for more work but is more performing memory-wise since a single big chunk is allocated instead of one small chunks pointing to ten medium ones...
But as Neil said, a proper class of your own to define the exact set of operations is better :)
Related
i have a class in which it's protected section i need to declare an array with unknown size (the size is given to the constructor as a parameter), so i looked around and found out that the best possible solution is to declare an array of pointers, each element points to an integer:
int* some_array_;
and simply in the constructor i'll use the "new" operator:
some_array_ = new int[size];
and it worked, my question is: can i declare an array in a class without defining the size? and if yes how do i do it, if not then why does it work for pointers and not for a normal array?
EDIT: i know vecotrs will solve the problem but i can't use them on my HW
You have to think about how this works from the compiler's perspective. A pointer uses a specific amount of space (usually 4 bytes) and you request more space with the new operator. But how much space does an empty array use? It can't be 0 bytes and the compiler has no way of knowing what space to allocate for an array without any elements and therefore it is not allowed.
You could always use a vector. To do this, add this line of code: #include <vector> at the top of your code, and then define the vector as follows:
vector<int> vectorName;
Keep in mind that vectors are not arrays and should not be treated as such. For example, in a loop, you would want to retrieve an element of a vector like this: vectorName.at(index) and not like this: vectorName[index]
Lets say that you have an integer array of size 2. So you have Array[0,1]
Arrays are continuous byte of memery, so if you declare one and then you want to add one or more elements to end of that array, the exact next position (in this case :at index 2(or the 3rd integer) ) has a high chance of being already allocated so in that case you just cant do it. A solution is to create a new array (in this case of 3 elements) , copy the initial array into the new and in the last position add the new integer. Obviously this has a high cost so we dont do it.
A solution to this problem in C++ is Vector and in Java are ArrayLists.
I want to create a dynamic array of a specific object that would also support adding new objects to the array.
I'm trying to solve this as part of an exercise in my course. In this exercise we are not supposed to use std::vector.
For example, let's say I have a class named Product and declare a pointer:
Products* products;
then I want to support the following:
products = new Product();
/* code here... */
products[1] = new Product(); // and so on...
I know the current syntax could lead to access violation. I don't know the size of the array in advance, as it can change throughout the program.
The questions are:
How can I write it without vectors?
Do I have to use double pointers (2-dimension)?
Every time I want to add a new object, do I have to copy the array to the new array (with +1 size), and then delete the array?
You should not write this without std::vector. If you for some reason need to, your copying with every resize is by far the easiest option.
I do not see how that would help. (I.e. no)
As mentioned above, this is by far the easiest method if you cannot use std::vector. Everything else would be (partially) reinventing one standard library container or the other, which is hard.
You have to use your own memory memory management, i.e. more specifically wrt your other (related) questions:
No, if you have a contiguous allocated chunk of memory where your data lives in.
Yes, if 2. is your desired implementation method. However, if you don't want to use a large memory chunk, you have to use a (double) linked list which does not require you to copy the whole array every time.
I see people already answered your specific questions, so I'll answer a more general answer.
You must implement a way to do it by yourself, but there are lots of Abstract Data Types that you can use, as far as I can see the simplest would be a linked list, such as the following:
class ProductNode
{
public:
ProductNode() : _data(NULL), _next(NULL)
{
}
void setProduct(Product* p); //setter for the product pointer
{
this->_data = p;
}
Product getProduct(); //getter for the product pointer
{
return *(this->_data);
}
void addNext(); //allocate memory for another ProductNode in '_next'
{
if(!next)
{
this->_next = new ProductNode();
}
}
ProductNode* getNext(); //get the allocated memory, the address that is in '_next'
{
return this->_next;
}
~ProductNode(); //delete every single node from that node and forward, it'll be recursive for a linked list
private:
Product* _data;
ProductNode* _next;
}
Declare a head variable and go from there.
Of course that most of the functions here should be implemented otherwise, it was coded quickly so you could see the basics that you need for this assignment.
That's one way.
Also you can make your own data type.
Or use some others data types for abstraction of the data.
What you probably should do (i.e. what I believe you're expected to do) is write your own class that represents a dynamic array (i.e. you're going to reinvent parts of std::vector.)
Despite what many around here say, this is a worthwhile exercise and should be part of a normal computer science curriculum.
Use a dynamically allocated array which is a member of your class.
If you're using a dynamically allocated array of Product*, you'll be storing a Product**, so yes, in a way. It's not necessary to have "double pointers" for the functionality itself, though.
Technically no - you can allocate more than necessary and only reallocate and copy when you run out of space. This is what vector does to improve its time complexity.
Expanding for each element is a good way to start though, and you can always change your strategy later.
(Get the simple way working first, get fancy if you need to.)
It's probably easiest to first implement an array of int (or some other basic numerical type) and make sure that it works, and then change the type of the contents.
I suppose by "Products *products;" you mean "Products" is a vector-like container.
1) How can I write it without vectors?
As a linked list. Instantiating a "Products products" will give you an empty linked list.
Overriding the operator[] will insert/replace the element in the list. So you need to scan the list to find the right place. If several elements are missing until you got the right place, you may need to append those "neutral" elements before your element. Doing so, through "Product *products" is not feasible if you plan to override the operator[] to handle addition of elements, unless you declare "Products products" instead
2) Do I have to use double pointers (2-dimension)?
This question lacks of precision. As in "typedef Product *Products;" then "Products *products" ? as long as you maintained a " * " between "Products" and "products", there is no way to override operator[] to handle addition of element.
3) Every time I want to add a new object, do I have to copy the array to the new array (with +1 size), and then delete the array?
If you stick with array, you can use a O(log2(n)) time reallocation, by simply growing twice the array size (and supposedly you have a zero-terminal or a count embedded). Or just use a linked list instead to avoid any copy of all elements before adding an element.
I have a large array of objects (10s Millions) of class A and I want to add a vector as a member to the class A. This vector is needed just for few percent of objects in the array. I was wondering, would it be a wise choice to add a vector to the class? how much memory will take an empty vector?
Now we know that empty vector are not very “big” (VC2012x64 intellisense show sizeof(std::vector<int>) is 16 byte). If sizeof(A) is much bigger than size-of vector adding a vector member to A could be a good solution for you. But if it is not good, and will add to much memory, and really not many A has the vector, I’d create a second container with the vectors. For example:
#include <unordered_map>
unordered_map<size_t , vector<T>> VectorForA;
where size_t is mean to be the type of the index of the big array of A, and vector<T> the type of the vector you want to add to A. This could be good for a fixed index "big" array. If somehow the A's in the big array dont have fixed positions, making the value of A the key could led to a simpler code(also, only if the values of A dont repeat).
NOTE: I was (I'm) wating to see a full answer from #Andy Prowl or #Tony D with I think will be very usefull
i have a doubt on following case;
suppose i want to define a vector of vector to acomadate set of elements and i can add the data and can be used those elemnts to compute something else. then i dont want that vector anymore. then later, suppose if i want to accomadae another set of data as a vector of vector, then i can reuse the previously created variable, then;
(1) if I created my vector of vector as dynamic memory and deleted as
vector<vector<double> > *myvector = new vector<vecctor<double> >;
//do push back and use it
delete myvector;
and then reuse again
(2) if I created my vector of vector as simply
vector<vector<double> > myvector;
//do push back and use it
myvector.clear();
and then reuse again
but, i guess in both method very few memory is remaining though we have removed it. so, i would like to know what is the efficient method to defin this vector of vector.
(3) if the size of the inside vector is always 2, then is it still efficient to use
vector of vector by defining
vector<vector<double> > myvector(my_list.size(), vector<double>(2)) other than another container type
(4) if i use predefined another class to hold inside 2 elements and then take a vector of those object type as (for example XY is the class which can hold 2 elements, may be as an array)
vector<XY>;
i hope, please anyone comment me what would be the most efficient method (from 1-4) interms of speed and memory needed. is there any better ways, plz suggest me too. thanks
If you only need two elements the most efficient way is probably:
std::vector<std::tr1::array<double, 2>> myvector;
// or std::vector<std::pair<double, double>> myvector;
// use it
myvector.clear(); // This will not deallocate any memory, so what has alrdy been allocated will be used for future push_backs
If this is homework (or not) and you want to know which is faster then you should try it: run the "push back and clear" or "new and delete" in a loop a few million times and time the execution. I suspect 2. will be a bit faster.
You say you want to accommodate a set of elements, and that the size of the inner vector is 2.
That is a bit ambiguous, but you can further examine 2 things:
1. std::pair, if your "element" contains 2 other "things"
2. std::map, if you want to reference of of the "elements" based on the value of another "element"
This is my little big question about containers, in particular, arrays.
I am writing a physics code that mainly manipulates a big (> 1 000 000) set of "particles" (with 6 double coordinates each). I am looking for the best way (in term of performance) to implement a class that will contain a container for these data and that will provide manipulation primitives for these data (e.g. instantiation, operator[], etc.).
There are a few restrictions on how this set is used:
its size is read from a configuration file and won't change during execution
it can be viewed as a big two dimensional array of N (e.g. 1 000 000) lines and 6 columns (each one storing the coordinate in one dimension)
the array is manipulated in a big loop, each "particle / line" is accessed and computation takes place with its coordinates, and the results are stored back for this particle, and so on for each particle, and so on for each iteration of the big loop.
no new elements are added or deleted during the execution
First conclusion, as the access on the elements is essentially done by accessing each element one by one with [], I think that I should use a normal dynamic array.
I have explored a few things, and I would like to have your opinion on the one that can give me the best performances.
As I understand there is no advantage to use a dynamically allocated array instead of a std::vector, so things like double** array2d = new ..., loop of new, etc are ruled out.
So is it a good idea to use std::vector<double> ?
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > my_array that can be indexed like my_array[i][j], or is it a bad idea and it would be better to use std::vector<double> other_array and acces it with other_array[6*i+j].
Maybe this can gives better performance, especially as the number of columns is fixed and known from the beginning.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with a index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like function call at each access) ?
Another option, the one that I am using so far is to use Blitz, in particular blitz::Array:
typedef blitz::Array<double,TWO_DIMENSIONS> store_t;
store_t my_store;
Where my elements are accessed like that: my_store(line, column);.
I think there are not much advantage to use Blitz in my case because I am accessing each element one by one and that Blitz would be interesting if I was using operations directly on array (like matrix multiplication) which I am not.
Do you think that Blitz is OK, or is it useless in my case ?
These are the possibilities I have considered so far, but maybe the best one I still another one, so don't hesitate to suggest me other things.
Thanks a lot for your help on this problem !
Edit:
From the very interesting answers and comments bellow a good solution seems to be the following:
Use a structure particle (containing 6 doubles) or a static array of 6 doubles (this avoid the use of two dimensional dynamic arrays)
Use a vector or a deque of this particle structure or array. It is then good to traverse them with iterators, and that will allow to change from one to another later.
In addition I can also use a Blitz::TinyVector<double,6> instead of a structure.
So is it a good idea to use std::vector<double> ?
Usually, a std::vector should be the first choice of container. You could use either std::vector<>::reserve() or std::vector<>::resize() to avoid reallocations while populating the vector. Whether any other container is better can be found by measuring. And only by measuring. But first measure whether anything the container is involved in (populating, accessing elements) is worth optimizing at all.
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > [...]?
No. IIUC, you are accessing your data per particle, not per row. If that's the case, why not use a std::vector<particle>, where particle is a struct holding six values? And even if I understood incorrectly, you should rather write a two-dimensional wrapper around a one-dimensional container. Then align your data either in rows or columns - what ever is faster with your access patterns.
Do you think that Blitz is OK, or is it useless in my case?
I have no practical knowledge about blitz++ and the areas it is used in. But isn't blitz++ all about expression templates to unroll loop operations and optimizing away temporaries when doing matrix manipulations? ICBWT.
First of all, you don't want to scatter the coordinates of one given particle all over the place, so I would begin by writing a simple struct:
struct Particle { /* coords */ };
Then we can make a simple one dimensional array of these Particles.
I would probably use a deque, because that's the default container, but you may wish to try a vector, it's just that 1.000.000 of particles means about a single chunk of a few MBs. It should hold but it might strain your system if this ever grows, while the deque will allocate several chunks.
WARNING:
As Alexandre C remarked, if you go the deque road, refrain from using operator[] and prefer to use iteration style. If you really need random access and it's performance sensitive, the vector should prove faster.
The first rule when choosing from containers is to use std::vector. Then, only after your code is complete and you can actually measure performance, you can try other containers. But stick to vector first. (And use reserve() from the start)
Then, you shouldn't use an std::vector<std::vector<double> >. You know the size of your data: it's 6 doubles. No need for it to be dynamic. It is constant and fixed. You can define a struct to hold you particle members (the six doubles), or you can simply typedef it: typedef double particle[6]. Then, use a vector of particles: std::vector<particle>.
Furthermore, as your program uses the particle data contained in the vector sequentially, you will take advantage of the modern CPU cache read-ahead feature at its best performance.
You could go several ways. But in your case, don't declare astd::vector<std::vector<double> >. You're allocating a vector (and you copy it around) for every 6 doubles. Thats way too costly.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with a index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like function call at each access) ?
(other_array[i,j] won't work too well, as i,j employs the comma operator to evaluate the value of "i", then discards that and evaluates and returns "j", so it's equivalent to other_array[i]).
You will need to use one of:
other_array[i][j]
other_array(i, j) // if other_array implements operator()(int, int),
// but std::vector<> et al don't.
other_array[i].identifier // identifier is a member variable
other_array[i].identifier() // member function getting value
other_array[i].identifier(double) // member function setting value
You may or may not prefer to put get_ and set_ or similar on the last two functions should you find them useful, but from your question I think you won't: functions are prefered in APIs between parts of large systems involving many developers, or when the data items may vary and you want the algorithms working on the data to be independent thereof.
So, a good test: if you find yourself writing code like other_array[i][3] where you've decided "3" is the double with the speed in it, and other_array[i][5] because "5" is the the acceleration, then stop doing that and give them proper identifiers so you can say other_array[i].speed and .acceleration. Then other developers can read and understand it, and you're much less likely to make accidental mistakes. On the other hand, if you are iterating over those 6 elements doing exactly the same things to each, then you probably do want Particle to hold a double[6], or to provide an operator[](int). There's no problem doing both:
struct Particle
{
double x[6];
double& speed() { return x[3]; }
double speed() const { return x[3]; }
double& acceleration() { return x[5]; }
...
};
BTW / the reason that vector<vector<double> > may be too costly is that each set of 6 doubles will be allocated on the heap, and for fast allocation and deallocation many heap implementations use fixed-size buckets, so your small request will be rounded up t the next size: that may be a significant overhead. The outside vector will also need to record a extra pointer to that memory. Further, heap allocation and deallocation is relatively slow - in you're case, you'd only be doing it at startup and shutdown, but there's no particular point in making your program slower for no reason. Even more importantly, the areas on the heap may just around in memory, so your operator[] may have cache-faults pulling in more distinct memory pages than necessary, slowing the entire program. Put another way, vectors store elements contiguously, but the pointed-to-vectors may not be contiguous.