I need to create a dynamic array of node objects in a function that performs some logic I will not delve into.
This function will be called multiple times consecutively for a different number of nodes (num. of nodes increments until it surpasses a million nodes).
At first I was initialising the array the following way:
node** heaps = new node*[arraySize];
for (int i=0; i < arraySize; i++)
heaps[i] = nullptr;
However, since this function is called a large number of times, the for loop slows down my program (I need the function to run in O(log n), but this loop at the beginning already makes it O(n)).
I then saw another way of initializing a dynamic array as below:
node** heaps = new node*[arraySize]();
My program seems to work the same with just the above line. However, I'm not really sure what the difference between the two methods is, or whether it really improves performance (I can't identify a big performance difference).
Can anyone explain?
The extra "()" you are asking about is called an initializer. It is optional unless the type you provide is auto, in which case the type will be deduced from the initializer you provide.
In your first example your node pointers are default-initialized, which means their values are indeterminate. They could point at anything.
In your second example, your node pointers are value-initialized, which means they are all null pointers.
The end result is the same either way. The second example relies on your implementation to provide zero-initialized pointers for you, but I'd be surprised if it weren't as quick or faster than your for loop. When it comes to performance, always measure.
http://en.cppreference.com/w/cpp/language/new
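A minimal sketch of the two forms discussed above. The node struct is a hypothetical stand-in (the question does not show its definition), and the function names make_heaps_loop, make_heaps_value_init and all_null are made up for illustration.

```cpp
#include <cstddef>

struct node { int key; };  // hypothetical stand-in for the question's node type

// First form: default-initialize, then null every slot by hand.
node** make_heaps_loop(std::size_t arraySize)
{
    node** heaps = new node*[arraySize];  // element values are indeterminate here
    for (std::size_t i = 0; i < arraySize; ++i)
        heaps[i] = nullptr;
    return heaps;
}

// Second form: the trailing () value-initializes, so every pointer is null.
node** make_heaps_value_init(std::size_t arraySize)
{
    return new node*[arraySize]();
}

// Helper for checking the result; both forms should satisfy it.
bool all_null(node* const* heaps, std::size_t arraySize)
{
    for (std::size_t i = 0; i < arraySize; ++i)
        if (heaps[i] != nullptr)
            return false;
    return true;
}
```

Either array has to be released with delete[]. Both forms still write all arraySize slots, so neither one changes the O(n) cost of the initialization.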
Using () will value-initialize the array. There would be minimal performance benefit over the loop, because it does the same job. However, if that initialization does cause a performance hit, then you will have to look elsewhere in your code/algorithm to find out whether anything can be moved to a not-so-warm code path.
If you have to use a dynamic array, use std::vector<node*> or std::array<node*,N> if you know N at compile time.
Or even better, use modern C++ facilities:
using node_ptr = std::unique_ptr<node>;
std::vector<node_ptr> heaps;
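As a rough sketch of how the vector of owning pointers above could be sized up front: constructing the vector with a count gives that many null unique_ptr elements, so no explicit loop is needed. The node struct and the make_heaps helper are assumptions made for this example.

```cpp
#include <cstddef>
#include <memory>
#include <vector>

struct node { int key; };  // hypothetical stand-in for the question's node type

using node_ptr = std::unique_ptr<node>;

// Builds a vector of arraySize owning pointers; each one starts out null.
std::vector<node_ptr> make_heaps(std::size_t arraySize)
{
    return std::vector<node_ptr>(arraySize);
}
```

Assigning a node later, e.g. heaps[3].reset(new node{42}), hands ownership to the vector, and every node is freed when the vector goes out of scope.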
I need to create a multidimensional matrix of randomly distributed numbers using a Gaussian distribution, and am trying to keep the program as optimized as possible. Currently I am using Boost matrices, but I can't seem to find anything that accomplishes this without manually looping. Ideally, I would like something similar to Python's numpy.random.randn() function, but this must be done in C++. Is there another way to accomplish this that is faster than manually looping?
You're going to have to loop anyway, but you can eliminate the array lookup inside your loop. True N-dimensional array indexing is going to be expensive, so your best option is any library (or one you write yourself) which also gives you access to an underlying linear data store.
You can then loop over the entire n-dimensional array as if it was linear, avoiding many multiplications of the indexes by the dimensions.
Another optimization is to do away with the index altogether: take a pointer to the first element, then iterate the pointer itself. This does away with a whole variable in the CPU, which can give the compiler more room for other things. E.g. if you had 1000 elements in a vector:
vector<int> data;
data.resize(1000);
int *intPtr = &data[0];
int *endPtr = &data[0] + 1000;  // end condition computed once, outside the loop
while (intPtr != endPtr)
{
    *intPtr = rand_function();  // store a random value in the current element
    ++intPtr;
}
Here, two tricks have happened. Pre-calculate the end condition outside the loop itself (this avoids a lookup of a function such as vector::size() 1000 times), and working with pointers to the data in memory rather than indexes. An index gets internally converted to a pointer every time it's used to access the array. By storing the "current pointer" and adding 1 to that each time, then the cost of calculating the pointers from indexes 1000 times is eliminated.
This can be faster but it depends on the implementation. Compilers can do some of the same hand-optimizations, but not all of them. The rand_function should also be inline to avoid the function call overhead.
A warning, however: if you use std::vector with the pointer trick, the loop is not thread safe. If another thread changes the vector's length during the loop, the vector can be reallocated to a different place in memory, leaving your pointers dangling. Don't do pointer tricks unless you'd be perfectly comfortable writing your own vector, array, and table classes as needed.
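To tie the linear-storage advice back to the Gaussian question, here is a hedged sketch that fills a row-major matrix in a single pass using the standard <random> facilities. It uses a plain std::vector<double> rather than a Boost matrix, and the function name make_gaussian_matrix and its seed parameter are assumptions added for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>
#include <vector>

// Fills a rows x cols matrix, stored row-major in one flat vector, with
// samples from a normal distribution. The seed parameter is an assumption
// added here so that runs are reproducible.
std::vector<double> make_gaussian_matrix(std::size_t rows, std::size_t cols,
                                         unsigned seed,
                                         double mean = 0.0, double stddev = 1.0)
{
    std::mt19937 engine(seed);
    std::normal_distribution<double> dist(mean, stddev);

    std::vector<double> data(rows * cols);
    // One linear pass over the underlying storage; no per-element (i, j) indexing.
    std::generate(data.begin(), data.end(), [&] { return dist(engine); });
    return data;
}
```

Element (i, j) of the result lives at data[i * cols + j]. There is still a loop, hidden inside std::generate, but it walks the storage linearly as described above.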
I want to use a structure like:
struct node {
char tag[10];
struct node *next;
};
I want to use the above structure to create a doubly-linked list. Is that possible and if yes, then how I can achieve it?
Yes, it's possible, but it's a dirty hack.
It's called an XOR linked list (https://en.wikipedia.org/wiki/XOR_linked_list).
Each node stores the XOR of next and prev as a uintptr_t.
Here is an example:
#include <cstdint>   // uintptr_t
#include <iostream>
#include <utility>   // std::swap
struct Node
{
int num;
uintptr_t ptr;
};
int main()
{
Node *arr[4];
// Here we create a new list.
int num = 0;
for (auto &it : arr)
{
it = new Node;
it->num = ++num;
}
arr[0]->ptr = (uintptr_t)arr[1];
arr[1]->ptr = (uintptr_t)arr[0] ^ (uintptr_t)arr[2];
arr[2]->ptr = (uintptr_t)arr[1] ^ (uintptr_t)arr[3];
arr[3]->ptr = (uintptr_t)arr[2];
// And here we iterate over it
Node *cur = arr[0], *prev = 0;
do
{
std::cout << cur->num << ' ';
prev = (Node *)(cur->ptr ^ (uintptr_t)prev);
std::swap(cur, prev);
}
while (cur);
return 0;
}
It prints 1 2 3 4 as expected.
I'd like to offer an alternative answer which boils down to "yes and no".
First, it's "sort of impossible" if you want to get the full benefits of a doubly-linked list with only one single pointer per node.
XOR List
The XOR linked list has already been cited here. By lossily compressing two pointers into one, it keeps one main benefit that a singly-linked list lacks: the ability to traverse the list in reverse. It cannot do things like remove an element from the middle of the list in constant time given only the node's address. And going back to the previous element during a forward iteration, or removing an arbitrary element in linear time, is even simpler without the XOR list (you likewise keep two node pointers there: previous and current).
Performance
The comments also mention a desire for performance. Given that, I think there are some practical alternatives.
First, the next/prev links in a doubly-linked list don't have to be, say, 64-bit pointers on 64-bit systems. They can be two 32-bit indices into a contiguous address space. Now you have two indices for the memory price of one pointer. Nevertheless, trying to emulate 32-bit addressing on a 64-bit system is quite involved, and maybe not exactly what you want.
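One simple way to get the "two indices for the price of one pointer" effect, sketched under the assumption that all nodes live in a single std::vector and that fewer than 2^32 - 1 nodes are ever created. IndexList, kNoIndex and the member names are illustrative only.

```cpp
#include <cstdint>
#include <vector>

constexpr std::uint32_t kNoIndex = 0xFFFFFFFFu;  // plays the role of a null link

// Doubly-linked list whose links are 32-bit indices into one contiguous
// vector instead of 64-bit pointers.
class IndexList
{
public:
    struct Node
    {
        int value;
        std::uint32_t prev;
        std::uint32_t next;
    };

    // Appends a value at the tail and returns the index of its node.
    std::uint32_t push_back(int value)
    {
        const std::uint32_t idx = static_cast<std::uint32_t>(nodes_.size());
        nodes_.push_back(Node{value, tail_, kNoIndex});
        if (tail_ != kNoIndex)
            nodes_[tail_].next = idx;
        else
            head_ = idx;
        tail_ = idx;
        return idx;
    }

    // Unlinks the node at idx in constant time; its slot stays in the vector unused.
    void unlink(std::uint32_t idx)
    {
        const Node n = nodes_[idx];  // copy, so the links can be rewritten safely
        if (n.prev != kNoIndex) nodes_[n.prev].next = n.next; else head_ = n.next;
        if (n.next != kNoIndex) nodes_[n.next].prev = n.prev; else tail_ = n.prev;
    }

    std::uint32_t head() const { return head_; }
    std::uint32_t tail() const { return tail_; }
    const Node& at(std::uint32_t idx) const { return nodes_[idx]; }

private:
    std::vector<Node> nodes_;
    std::uint32_t head_ = kNoIndex;
    std::uint32_t tail_ = kNoIndex;
};
```

The prev and next fields together take 8 bytes, the size of one pointer on a typical 64-bit system, and the nodes end up contiguous in memory as a side effect. Reusing unlinked slots is left out of this sketch.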
However, to reap the full performance benefits of a linked structure (trees included) often requires you to get back control over how the nodes are allocated and distributed in memory. Linked structures tend to be bottlenecky because, if you just use malloc or plain operator new for every node, e.g., you lose control over memory layout. Often (not always -- you can get lucky depending on the memory allocator, and whether you allocate all your nodes at once or not) this means a loss of contiguity, which means a loss of spatial locality.
It's why data-oriented design stresses arrays more than anything else: linked structures are not normally very friendly for performance. The process of moving chunks from larger memory to smaller, faster memory likes it if you are going to access neighboring data within the same chunk (cache line/page, e.g.) prior to eviction.
The Not-So-Often-Cited Unrolled List
So there's a hybrid solution here which is not so often discussed, which is the unrolled list. Example:
struct Element
{
...
};
struct UnrolledNode
{
struct Element elements[32];
struct UnrolledNode* prev;
struct UnrolledNode* next;
};
An unrolled list combines the characteristics of arrays and doubly-linked lists all into one. It'll give you back a lot of spatial locality without having to look to the memory allocator.
It can traverse forwards and backwards, it can remove arbitrary elements from the middle at any given time for cheap.
And it reduces the linked list overhead to the absolute minimum: in this case I hard-coded an unrolled array size of 32 elements per node. That means the cost of storing the list pointers has shrunk to 1/32nd of its normal size. That's even cheaper from a list pointer overhead standpoint than a singly-linked list, with often faster traversal (because of cache locality).
It's not a perfect replacement for a doubly-linked list. For a start, if you are worried about the invalidation of existing pointers to elements in the list on removal, then you have to start worrying about leaving vacant spaces (holes/tombstones) behind that get reclaimed (possibly by associating free bits in each unrolled node). At that point you're dealing with many similar concerns of implementing a memory allocator, including some minor forms of fragmentation (ex: having an unrolled node with 31 vacant spaces and only one element occupied -- the node still has to stay around in memory to avoid invalidation until it becomes completely empty).
An "iterator" to it which allows insertion/removal to/from the middle typically has to be larger than a pointer (unless, as noted in the comments, you store additional metadata with each element). It can waste memory (typically moot unless you have really teeny lists) by requiring, say, the memory for 32 elements even if you have a list of only 1 element. It does tend to be a little more complex to implement than any of these above solutions. But it's a very useful solution in a performance-critical scenario, and often one that probably deserves more attention. It's one that's not brought up so much in computer science since it doesn't do any better from an algorithmic perspective than a regular linked list, but locality of reference has a significant impact on performance as well in real-world scenarios.
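A small sketch of how appending to and traversing an unrolled list might look. To keep the behaviour easy to check it stores plain int elements and uses 4 slots per node instead of 32; the count field, the kNodeCapacity constant and the UnrolledList class are additions made for this example. Removal from the middle, with the tombstone handling described above, is not covered.

```cpp
#include <cstddef>

constexpr std::size_t kNodeCapacity = 4;  // the text above uses 32

struct UnrolledNode
{
    int elements[kNodeCapacity];
    std::size_t count;       // how many slots of 'elements' are in use
    UnrolledNode* prev;
    UnrolledNode* next;
};

class UnrolledList
{
public:
    UnrolledList() : head_(nullptr), tail_(nullptr) {}
    UnrolledList(const UnrolledList&) = delete;
    UnrolledList& operator=(const UnrolledList&) = delete;

    ~UnrolledList()
    {
        while (head_)
        {
            UnrolledNode* next = head_->next;
            delete head_;
            head_ = next;
        }
    }

    // Appends to the last node, allocating a new node only when it is full.
    void push_back(int value)
    {
        if (!tail_ || tail_->count == kNodeCapacity)
        {
            UnrolledNode* n = new UnrolledNode();  // zeroed: count 0, null links
            n->prev = tail_;
            if (tail_) tail_->next = n; else head_ = n;
            tail_ = n;
        }
        tail_->elements[tail_->count++] = value;
    }

    // Forward traversal: the outer loop follows links, the inner loop scans an array.
    long sum() const
    {
        long total = 0;
        for (const UnrolledNode* n = head_; n; n = n->next)
            for (std::size_t i = 0; i < n->count; ++i)
                total += n->elements[i];
        return total;
    }

    // Number of nodes, i.e. number of separate allocations.
    std::size_t node_count() const
    {
        std::size_t n = 0;
        for (const UnrolledNode* p = head_; p; p = p->next) ++n;
        return n;
    }

private:
    UnrolledNode* head_;
    UnrolledNode* tail_;
};
```

With 10 elements and 4 slots per node, only 3 nodes are allocated, so the traversal follows 2 links instead of 9.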
It's not completely possible. A doubly linked list requires two pointers, one for the link in each direction.
Depending on what you need the XOR linked list may do what you need (see HolyBlackCat's answer).
Another option is to work around this limitation a little by doing things like remembering the last node you processed as you iterate through the list. This will let you go back one step during the processing but it doesn't make the list doubly linked.
You can declare and maintain two initial pointers to nodes, head and tail. In this case you will be able to add nodes to both ends of the list.
Such a list is sometimes called a two-sided list.
However the list itself will be a forward list.
Using such a list you can for example simulate a queue.
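A minimal sketch of the head/tail idea used as a queue, assuming int payloads. The class name IntQueue and its members are made up for illustration; each node still has only a next pointer.

```cpp
// Forward (singly-linked) list with head and tail pointers, used as a FIFO queue.
class IntQueue
{
    struct Node
    {
        int value;
        Node* next;
    };

public:
    IntQueue() : head_(nullptr), tail_(nullptr) {}
    IntQueue(const IntQueue&) = delete;
    IntQueue& operator=(const IntQueue&) = delete;
    ~IntQueue() { while (!empty()) pop_front(); }

    bool empty() const { return head_ == nullptr; }

    // Constant-time append thanks to the tail pointer.
    void push_back(int value)
    {
        Node* n = new Node{value, nullptr};
        if (tail_) tail_->next = n; else head_ = n;
        tail_ = n;
    }

    // Precondition for front() and pop_front(): the queue is not empty.
    int front() const { return head_->value; }

    void pop_front()
    {
        Node* old = head_;
        head_ = head_->next;
        if (!head_) tail_ = nullptr;  // the list became empty
        delete old;
    }

private:
    Node* head_;
    Node* tail_;
};
```

Removing from the tail would still need a scan from the head, which is why the list stays a forward list.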
It is not possible in a portable way without invoking undefined behavior:
Can an XOR linked list be implemented in C++ without causing undefined behavior?
I want to create a dynamic array of a specific object that would also support adding new objects to the array.
I'm trying to solve this as part of an exercise in my course. In this exercise we are not supposed to use std::vector.
For example, let's say I have a class named Product and declare a pointer:
Products* products;
then I want to support the following:
products = new Product();
/* code here... */
products[1] = new Product(); // and so on...
I know the current syntax could lead to access violation. I don't know the size of the array in advance, as it can change throughout the program.
The questions are:
1) How can I write it without vectors?
2) Do I have to use double pointers (2-dimension)?
3) Every time I want to add a new object, do I have to copy the array to the new array (with +1 size), and then delete the array?
1) You should not write this without std::vector. If for some reason you need to, copying on every resize is by far the easiest option.
2) I do not see how that would help. (I.e. no.)
3) As mentioned above, this is by far the easiest method if you cannot use std::vector. Everything else would be (partially) reinventing one standard library container or another, which is hard.
1) You have to do your own memory management. More specifically, with respect to your other (related) questions:
2) No, if you have a contiguously allocated chunk of memory where your data lives.
3) Yes, if 2) is your desired implementation method. However, if you don't want to use one large memory chunk, you have to use a (doubly) linked list, which does not require you to copy the whole array every time.
I see people have already answered your specific questions, so I'll give a more general answer.
You must implement a way to do it by yourself, but there are lots of abstract data types that you can use. As far as I can see, the simplest would be a linked list, such as the following:
class ProductNode
{
public:
    ProductNode() : _data(nullptr), _next(nullptr)
    {
    }
    // setter for the product pointer (the node does not own the Product)
    void setProduct(Product* p)
    {
        _data = p;
    }
    // getter: returns a copy of the stored product; _data must not be null
    Product getProduct() const
    {
        return *_data;
    }
    // allocate memory for another ProductNode in '_next' if there is none yet
    void addNext()
    {
        if (!_next)
        {
            _next = new ProductNode();
        }
    }
    // get the allocated memory, the address that is in '_next'
    ProductNode* getNext() const
    {
        return _next;
    }
    // delete every single node from that node and forward; recursive,
    // because deleting '_next' runs its destructor as well
    ~ProductNode()
    {
        delete _next;
    }
private:
    Product* _data;
    ProductNode* _next;
};
Declare a head variable and go from there.
Of course, most of the functions here should be implemented differently; it was coded quickly so you could see the basics that you need for this assignment.
That's one way.
You can also make your own data type.
Or use some other data types for abstraction of the data.
What you probably should do (i.e. what I believe you're expected to do) is write your own class that represents a dynamic array (i.e. you're going to reinvent parts of std::vector.)
Despite what many around here say, this is a worthwhile exercise and should be part of a normal computer science curriculum.
1) Use a dynamically allocated array which is a member of your class.
2) If you're using a dynamically allocated array of Product*, you'll be storing a Product**, so yes, in a way. It's not necessary to have "double pointers" for the functionality itself, though.
3) Technically no - you can allocate more than necessary and only reallocate and copy when you run out of space. This is what vector does to improve its time complexity.
Expanding for each element is a good way to start though, and you can always change your strategy later.
(Get the simple way working first, get fancy if you need to.)
It's probably easiest to first implement an array of int (or some other basic numerical type) and make sure that it works, and then change the type of the contents.
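Following the suggestion to start with int, a minimal sketch of such a class might look like this. IntArray and its members are illustrative names, and the doubling policy is the "allocate more than necessary" strategy mentioned above rather than the grow-by-one starting point.

```cpp
#include <cstddef>

// Tiny dynamic array of int that doubles its capacity when it runs out of space.
class IntArray
{
public:
    IntArray() : data_(nullptr), size_(0), capacity_(0) {}
    ~IntArray() { delete[] data_; }

    // Copying is disabled to keep the sketch short; a full version would
    // implement a copy constructor and copy assignment.
    IntArray(const IntArray&) = delete;
    IntArray& operator=(const IntArray&) = delete;

    void push_back(int value)
    {
        if (size_ == capacity_)
            grow(capacity_ == 0 ? 1 : capacity_ * 2);
        data_[size_++] = value;
    }

    int& operator[](std::size_t i) { return data_[i]; }
    const int& operator[](std::size_t i) const { return data_[i]; }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    // Allocates a bigger block, copies the old elements over, frees the old block.
    void grow(std::size_t newCapacity)
    {
        int* bigger = new int[newCapacity];
        for (std::size_t i = 0; i < size_; ++i)
            bigger[i] = data_[i];
        delete[] data_;
        data_ = bigger;
        capacity_ = newCapacity;
    }

    int* data_;
    std::size_t size_;
    std::size_t capacity_;
};
```

Replacing the growth expression with capacity_ + 1 gives the simpler copy-on-every-add behaviour. Once the int version works, the element type can be changed to Product or Product*.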
I suppose by "Products *products;" you mean "Products" is a vector-like container.
1) How can I write it without vectors?
As a linked list. Instantiating a "Products products" will give you an empty linked list.
Overloading operator[] lets you insert or replace an element in the list, so you need to scan the list to find the right place. If several elements are missing before that place, you may need to append those "neutral" elements before your element. Doing this through "Product *products" is not feasible: if you plan to overload operator[] to handle the addition of elements, you must declare "Products products" instead.
2) Do I have to use double pointers (2-dimension)?
This question lacks precision. Do you mean "typedef Product *Products;" followed by "Products *products"? As long as you keep a " * " between "Products" and "products", there is no way to overload operator[] to handle the addition of elements.
3) Every time I want to add a new object, do I have to copy the array to the new array (with +1 size), and then delete the array?
If you stick with an array, you can get by with O(log2(n)) reallocations by doubling the array size each time it fills up (assuming you keep a zero terminator or an embedded count). Or just use a linked list instead, to avoid copying all the elements before adding one.
I am writing some code for solving vehicle routing problems. One important decision is how to encode the solutions. A solution contains several routes, one for each vehicle. Each route has a customer visiting sequence, a load, and a length.
To perform modifications on a solution, I also need to quickly find some information.
For example,
Which route is a customer in?
What customers does a route have?
How many nodes are there in a route?
What nodes are in front of or behind a node?
Now, I am thinking of using the following structure to keep a solution.
struct Sol
{
vector<short> nextNode; // show what is the next node of each node;
vector<short> preNode; //show what is the preceding node
vector<short> startNode;
vector<short> rutNum;
vector<short> rutLoad;
vector<float> rutLength;
vector<short> rutSize;
};
The common size of each vector is instance dependent, between 200-2000.
I heard it is possible to use a dynamic array to do this job. But it seems to me a dynamic array is more complicated: one has to allocate the memory and release it. My question is twofold.
How can I use a dynamic array for the same purpose? How should I define the struct or class so that memory allocation and release are easily taken care of?
Will using a dynamic array be faster than using a vector, assuming the solution structure needs to be accessed millions of times?
It is highly unlikely that you'll see an appreciable performance difference between a dynamic array and a vector since the latter is essentially a very thin wrapper around the former. Also bear in mind that using a vector would be significantly less error-prone.
It may, however, be the case that some information is better stored in a different type of container altogether, e.g. in an std::map. The following might be of interest: What are the complexity guarantees of the standard containers?
It is important to give some thought to the type of container that gets used. However, when it comes to micro-optimizations (such as vector vs dynamic array), the best policy is to profile the code first and only focus on parts of the code that prove to be real -- rather than assumed -- bottlenecks.
It's quite possible that vector's code is actually better and more performant than dynamic array code you would write yourself. Only if profiling shows significant time spent in vector would I consider writing my own error-prone replacement. See also Dynamically allocated arrays or std::vector
I'm using MSVC and the implementation looks to be as quick as it can be.
Accessing the array via operator [] is:
return (*(this->_Myfirst + _Pos));
Which is as quick as you are going to get with dynamic memory.
The only overhead you are going to get is in the memory use of a vector: it seems to store a pointer to the start of the vector, the end of the vector, and the end of the current sequence. This is only 2 more pointers than you would need if you were using a dynamic array. You are only creating 200-2000 of these, so I doubt memory is going to be that tight.
I am sure the other stl implementations are very similar. I would absorb the minor cost of vector storage and use them in your project.
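If a raw dynamic array is still wanted, one way to have allocation and release "taken care of" is to let std::unique_ptr<short[]> own it. The sketch below is an assumption-laden illustration: SolArrays is a hypothetical name, and only two of the question's seven arrays are shown.

```cpp
#include <cstddef>
#include <memory>

// Owns two dynamic arrays of n shorts each; both are freed automatically
// when the SolArrays object is destroyed.
struct SolArrays
{
    explicit SolArrays(std::size_t n)
        : size(n),
          nextNode(new short[n]()),  // the trailing () zero-initializes the elements
          preNode(new short[n]())
    {
    }

    std::size_t size;
    std::unique_ptr<short[]> nextNode;  // next node of each node
    std::unique_ptr<short[]> preNode;   // preceding node of each node
};
```

Element access is still nextNode[i], as with a vector. Unlike a vector, the struct has to carry the size itself and cannot be copied without extra code, which is part of why the answers above lean towards std::vector.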
This is my little big question about containers, in particular, arrays.
I am writing a physics code that mainly manipulates a big (> 1 000 000) set of "particles" (with 6 double coordinates each). I am looking for the best way (in term of performance) to implement a class that will contain a container for these data and that will provide manipulation primitives for these data (e.g. instantiation, operator[], etc.).
There are a few restrictions on how this set is used:
its size is read from a configuration file and won't change during execution
it can be viewed as a big two dimensional array of N (e.g. 1 000 000) lines and 6 columns (each one storing the coordinate in one dimension)
the array is manipulated in a big loop, each "particle / line" is accessed and computation takes place with its coordinates, and the results are stored back for this particle, and so on for each particle, and so on for each iteration of the big loop.
no new elements are added or deleted during the execution
First conclusion, as the access on the elements is essentially done by accessing each element one by one with [], I think that I should use a normal dynamic array.
I have explored a few things, and I would like to have your opinion on the one that can give me the best performances.
As I understand there is no advantage to use a dynamically allocated array instead of a std::vector, so things like double** array2d = new ..., loop of new, etc are ruled out.
So is it a good idea to use std::vector<double> ?
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > my_array that can be indexed like my_array[i][j], or is that a bad idea, and would it be better to use std::vector<double> other_array and access it with other_array[6*i+j]?
Maybe this can give better performance, especially as the number of columns is fixed and known from the beginning.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with a index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like function call at each access) ?
Another option, the one that I am using so far is to use Blitz, in particular blitz::Array:
typedef blitz::Array<double,TWO_DIMENSIONS> store_t;
store_t my_store;
Where my elements are accessed like that: my_store(line, column);.
I think there is not much advantage in using Blitz in my case, because I am accessing each element one by one. Blitz would be interesting if I were operating directly on whole arrays (like matrix multiplication), which I am not.
Do you think that Blitz is OK, or is it useless in my case ?
These are the possibilities I have considered so far, but maybe the best one is still another one, so don't hesitate to suggest other things.
Thanks a lot for your help on this problem !
Edit:
From the very interesting answers and comments below, a good solution seems to be the following:
Use a structure particle (containing 6 doubles) or a static array of 6 doubles (this avoids the use of two dimensional dynamic arrays)
Use a vector or a deque of this particle structure or array. It is then good to traverse them with iterators, which will allow changing from one to the other later.
In addition I can also use a Blitz::TinyVector<double,6> instead of a structure.
So is it a good idea to use std::vector<double> ?
Usually, a std::vector should be the first choice of container. You could use either std::vector<>::reserve() or std::vector<>::resize() to avoid reallocations while populating the vector. Whether any other container is better can be found by measuring. And only by measuring. But first measure whether anything the container is involved in (populating, accessing elements) is worth optimizing at all.
If I use a std::vector, should I create a two dimensional array like std::vector<std::vector<double> > [...]?
No. IIUC, you are accessing your data per particle, not per row. If that's the case, why not use a std::vector<particle>, where particle is a struct holding six values? And even if I understood incorrectly, you should rather write a two-dimensional wrapper around a one-dimensional container. Then lay out your data either in rows or in columns, whichever is faster with your access patterns.
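A possible shape for the two-dimensional wrapper around a one-dimensional container mentioned above, laid out row by row. Grid2D and its member names are assumptions made for this sketch.

```cpp
#include <cstddef>
#include <vector>

// Row-major 2-D view over one contiguous std::vector<double>.
class Grid2D
{
public:
    Grid2D(std::size_t rows, std::size_t cols)
        : cols_(cols), data_(rows * cols, 0.0)
    {
    }

    // grid(i, j) maps to the flat index i * cols + j.
    double& operator()(std::size_t i, std::size_t j) { return data_[i * cols_ + j]; }
    double operator()(std::size_t i, std::size_t j) const { return data_[i * cols_ + j]; }

    std::size_t rows() const { return cols_ == 0 ? 0 : data_.size() / cols_; }
    std::size_t cols() const { return cols_; }

private:
    std::size_t cols_;
    std::vector<double> data_;
};
```

Because operator() is defined inside the class, it is implicitly inline, and optimizing compilers will typically reduce grid(i, j) to the same address computation as other_array[6*i+j]. Measuring remains the only way to be sure.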
Do you think that Blitz is OK, or is it useless in my case?
I have no practical knowledge about blitz++ and the areas it is used in. But isn't blitz++ all about expression templates to unroll loop operations and optimizing away temporaries when doing matrix manipulations? ICBWT.
First of all, you don't want to scatter the coordinates of one given particle all over the place, so I would begin by writing a simple struct:
struct Particle { /* coords */ };
Then we can make a simple one dimensional array of these Particles.
I would probably use a deque, because that's the default container, but you may wish to try a vector. Just keep in mind that 1,000,000 particles of six doubles each means a single contiguous chunk of about 48 MB (assuming 8-byte doubles). It should hold, but it might strain your system if this ever grows, whereas the deque will allocate several smaller chunks.
WARNING:
As Alexandre C remarked, if you go the deque road, refrain from using operator[] and prefer to use iteration style. If you really need random access and it's performance sensitive, the vector should prove faster.
The first rule when choosing from containers is to use std::vector. Then, only after your code is complete and you can actually measure performance, you can try other containers. But stick to vector first. (And use reserve() from the start)
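Then, you shouldn't use a std::vector<std::vector<double> >. You know the size of your data: it's 6 doubles. No need for it to be dynamic. It is constant and fixed. You can define a struct to hold your particle members (the six doubles), or you can simply typedef it. Note that a raw array typedef such as typedef double particle[6] will not work as a vector element type, because arrays cannot be copied or assigned; typedef std::array<double, 6> particle (C++11) does the same job. Then, use a vector of particles: std::vector<particle>.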
Furthermore, as your program uses the particle data contained in the vector sequentially, you will take advantage of the modern CPU cache read-ahead feature at its best performance.
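A short sketch of the vector-of-particles layout and a sequential pass over it. The Particle struct, the make_particles and step functions, and the "add dt to every coordinate" update are all placeholders invented for illustration; the real computation would go in their place.

```cpp
#include <cstddef>
#include <vector>

struct Particle
{
    double coords[6];  // which slot holds which quantity is left open here
};

// Creates n particles with all coordinates set to zero.
std::vector<Particle> make_particles(std::size_t n)
{
    return std::vector<Particle>(n);  // value-initializes every Particle
}

// One pass of the "big loop": visits particles in storage order.
// Adding dt to each coordinate is a placeholder for the real computation.
void step(std::vector<Particle>& particles, double dt)
{
    for (Particle& p : particles)
        for (double& c : p.coords)
            c += dt;
}
```

Since the size is known up front, the vector is allocated once, and the loop touches memory strictly front to back.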
You could go several ways. But in your case, don't declare a std::vector<std::vector<double> >. You're allocating a vector (and you copy it around) for every 6 doubles. That's way too costly.
If you think that this is the best option, would it be possible to wrap this vector in a way that it can be accessed with a index operator defined as other_array[i,j] // same as other_array[6*i+j] without overhead (like function call at each access) ?
(other_array[i,j] won't work too well: before C++23, i,j employs the comma operator, which evaluates "i", discards the result, then evaluates and returns "j", so it's equivalent to other_array[j]).
You will need to use one of:
other_array[i][j]
other_array(i, j) // if other_array implements operator()(int, int),
// but std::vector<> et al don't.
other_array[i].identifier // identifier is a member variable
other_array[i].identifier() // member function getting value
other_array[i].identifier(double) // member function setting value
You may or may not prefer to put get_ and set_ or similar on the last two functions should you find them useful, but from your question I think you won't: functions are preferred in APIs between parts of large systems involving many developers, or when the data items may vary and you want the algorithms working on the data to be independent thereof.
So, a good test: if you find yourself writing code like other_array[i][3] where you've decided "3" is the double with the speed in it, and other_array[i][5] because "5" is the acceleration, then stop doing that and give them proper identifiers so you can say other_array[i].speed and .acceleration. Then other developers can read and understand it, and you're much less likely to make accidental mistakes. On the other hand, if you are iterating over those 6 elements doing exactly the same things to each, then you probably do want Particle to hold a double[6], or to provide an operator[](int). There's no problem doing both:
struct Particle
{
double x[6];
double& speed() { return x[3]; }
double speed() const { return x[3]; }
double& acceleration() { return x[5]; }
...
};
BTW, the reason that vector<vector<double> > may be too costly is that each set of 6 doubles will be allocated on the heap, and for fast allocation and deallocation many heap implementations use fixed-size buckets, so your small request will be rounded up to the next size: that may be a significant overhead. The outer vector will also need to record an extra pointer to that memory. Further, heap allocation and deallocation are relatively slow. In your case, you'd only be doing it at startup and shutdown, but there's no particular point in making your program slower for no reason. Even more importantly, the areas on the heap may be scattered around in memory, so your operator[] may incur cache misses, pulling in more distinct memory pages than necessary and slowing the entire program. Put another way, vectors store elements contiguously, but the pointed-to vectors may not be contiguous with each other.