Pre-allocating memory for linked list [closed] - c++

Closed 2 years ago.
In a technical interview, the interviewer said that "he wants to pre-allocate memory for a linked list just like we can do for array", and asked how he would do that.
I have never felt the need to do this, nor had the thought ever occurred to me. I mostly code in C++, and I answered something like "just like we use the new operator in C++ for memory allocation, for example int *p = new int[10], where I can allocate 40 bytes of memory, I'd do something same for my Linked List, like Node *p = new Node()[10] , where Node is my Linked List class name, which is like this:
class Node{
public:
int data;
Node *next;
};
".
He then followed up by asking how I would go about implementing this, and whether it would really save time, considering space is not an issue. I mostly fumbled my way through the answer and he moved on to the next question.
But I'd really like to know now whether I was correct, and a small example of how it works would really help. Thank you.

Interview questions are generally asked not to be answered directly, and it is expected that you narrow down the use-case and requirements.
he wants to pre-allocate memory for a linked list just like we can do for array
If that is actually the question, then the interviewer asked it wrongly or misleadingly, perhaps intentionally. An array (std::array or a C-style array) will not only allocate the memory for the elements it stores but also construct them (at least for non-primitives), so it is important to know whether it is a general-purpose list or a specialized list for certain types. A std::vector, on the other hand, actually pre-allocates memory.
You generally want to minimize the number of individual memory allocations because those can be expensive.
I'd do something same for my Linked List, like Node *p = new Node()[10]
You don't want to do that because this would already construct the type managed by the list for each node. In the case of primitives, this won't be much of a problem, but would horribly fail for a general-purpose list like std::list.
Then he further followed it up with how would you go about implementing this and would it really save time, considering space is not an issue?
You would allocate a larger chunk of memory (similar to what std::vector does), and when an element is stored in the list, you would use placement new to construct the node in the already pre-allocated space.
If space is not a problem and a list pre-allocates space for, e.g., 100 elements, it would save 99 memory allocations per 100 stored objects. You do need to add some cost for manually keeping track of which parts of the pre-allocated space are free and which are not, but that is likely to be cheaper than allocating memory.
This is just a rough idea about pre-allocating memory for a list. But the question is missing too many pieces of information to answer it in a meaningful way.
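The idea above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a definitive implementation: PoolNode, NodePool, create and destroy are hypothetical names, the pool has a fixed capacity, and the caller is expected to destroy every live node before the pool goes away.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <utility>

// Hypothetical node type for illustration; T is the stored element type.
template <typename T>
struct PoolNode {
    T value;
    PoolNode* next;
};

// Hypothetical fixed-capacity pool: one allocation up front, nodes are
// constructed on demand with placement new and recycled via a free list.
template <typename T>
class NodePool {
public:
    using Node = PoolNode<T>;

    explicit NodePool(std::size_t capacity)
        : capacity_(capacity),
          storage_(static_cast<unsigned char*>(
              ::operator new(capacity * sizeof(Node)))) {}

    // Assumption: every node handed out has already been passed to destroy().
    ~NodePool() { ::operator delete(storage_); }

    NodePool(const NodePool&) = delete;
    NodePool& operator=(const NodePool&) = delete;

    // Construct a node in a free slot; returns nullptr when the pool is full.
    Node* create(T value) {
        void* slot = nullptr;
        if (free_ != nullptr) {            // reuse a released slot first
            slot = free_;
            free_ = free_->next;
        } else if (used_ < capacity_) {    // otherwise take an untouched slot
            slot = storage_ + used_ * sizeof(Node);
            ++used_;
        } else {
            return nullptr;
        }
        return ::new (slot) Node{std::move(value), nullptr};
    }

    // Run the node's destructor and push its slot onto the free list.
    void destroy(Node* node) {
        assert(node != nullptr);
        node->~Node();
        free_ = ::new (static_cast<void*>(node)) FreeSlot{free_};
    }

private:
    struct FreeSlot { FreeSlot* next; };
    static_assert(sizeof(Node) >= sizeof(FreeSlot), "slot too small");
    static_assert(alignof(Node) <= alignof(std::max_align_t),
                  "over-aligned types are not handled in this sketch");

    std::size_t capacity_;
    std::size_t used_ = 0;
    unsigned char* storage_;
    FreeSlot* free_ = nullptr;
};
```

Constructing nodes lazily avoids the problem mentioned above of constructing every element up front, and the free list is the "keeping track" cost the answer refers to.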

how would you go about implementing this
We can certainly create nodes without actually storing data in them. We can use a constructor (of the linked list) to get it done.
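class LinkedList {
public:
    // Pre-allocate n empty nodes and chain them together.
    explicit LinkedList(int n)
    {
        if (n <= 0)
            return;
        pRootNode = new Node();          // value-initialized, so next == nullptr
        Node* pTraveler = pRootNode;
        for (int i = 1; i < n; i++)      // the root already counts as the first node
        {
            pTraveler->next = new Node();
            pTraveler = pTraveler->next;
        }
    }
    // Destructor and copy control are omitted for brevity.
private:
    Node* pRootNode = nullptr;
};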
I'd do something same for my Linked List, like Node *p = new Node()[10]
Written as new Node[10] (new Node()[10] does not compile), this will give you an array of nodes. You then need to process it so that each node contains a pointer to the next one.
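A minimal sketch of that processing step, assuming the Node layout from the question. The helper names make_linked_block and chain_length are made up for illustration, and the caller is assumed to release the block with delete[].

```cpp
#include <cassert>
#include <cstddef>

struct Node {
    int data;
    Node* next;
};

// Allocate `count` nodes in a single block and chain them in array order.
// The caller owns the block and must release it with delete[].
Node* make_linked_block(std::size_t count) {
    if (count == 0) {
        return nullptr;
    }
    Node* block = new Node[count]();     // one allocation, zero-initialized
    for (std::size_t i = 0; i + 1 < count; ++i) {
        block[i].next = &block[i + 1];   // each node points at its neighbour
    }
    block[count - 1].next = nullptr;     // the last node ends the chain
    return block;
}

// Count the nodes reachable by following next pointers.
std::size_t chain_length(const Node* head) {
    std::size_t n = 0;
    for (; head != nullptr; head = head->next) {
        ++n;
    }
    return n;
}
```

Because all nodes live in one block, individual nodes cannot be freed with delete; the whole block is released at once.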
would it really save time
A linked list like this speeds up insertions, as we don't need to allocate a new node when inserting an entry (until the pre-allocated nodes run out). But constructing the linked list will take some (comparatively small) time, as we are allocating the nodes in the constructor.
Linked lists are said to have O(1) insertions and deletions, with a worst case of O(n) for access/lookup. So in my opinion, pre-allocating will have little effect, because you'll spend the same amount of time allocating nodes anyway.


Can we implement a doubly-linked list using a single pointer? [duplicate]

Closed 7 years ago.
I want to use a structure like:
struct node {
char tag[10];
struct node *next;
};
I want to use the above structure to create a doubly-linked list. Is that possible and, if yes, how can I achieve it?
Yes, it's possible, but it's a dirty hack.
It's called an XOR linked list. (https://en.wikipedia.org/wiki/XOR_linked_list)
Each node stores the XOR of next and prev as a uintptr_t.
Here is an example:
#include <cstdint>   // uintptr_t
#include <iostream>
#include <utility>   // std::swap
struct Node
{
int num;
uintptr_t ptr;
};
int main()
{
Node *arr[4];
// Here we create a new list.
int num = 0;
for (auto &it : arr)
{
it = new Node;
it->num = ++num;
}
arr[0]->ptr = (uintptr_t)arr[1];
arr[1]->ptr = (uintptr_t)arr[0] ^ (uintptr_t)arr[2];
arr[2]->ptr = (uintptr_t)arr[1] ^ (uintptr_t)arr[3];
arr[3]->ptr = (uintptr_t)arr[2];
// And here we iterate over it
Node *cur = arr[0], *prev = 0;
do
{
std::cout << cur->num << ' ';
prev = (Node *)(cur->ptr ^ (uintptr_t)prev);
std::swap(cur, prev);
}
while (cur);
return 0;
}
It prints 1 2 3 4 as expected.
I'd like to offer an alternative answer which boils down to "yes and no".
First, it's "sort of impossible" if you want to get the full benefits of a doubly-linked list with only one single pointer per node.
XOR List
The XOR linked list was also mentioned here. By squeezing two pointers into one (a lossy compression), it retains one main benefit that a singly-linked list loses: the ability to traverse in reverse. It cannot do things like remove an element from the middle of the list in constant time given only the node address. And going back to the previous element during a forward iteration, or removing an arbitrary element in linear time, is even simpler without the XOR list (you likewise keep two node pointers there: previous and current).
Performance
A desire for performance was also mentioned in the comments. Given that, I think there are some practical alternatives.
First, the next/prev links in a doubly-linked list don't have to be, say, 64-bit pointers on 64-bit systems. They can be two 32-bit indices into a contiguous address space. Now you get two indices for the memory price of one pointer. Nevertheless, trying to emulate 32-bit addressing on 64-bit is quite involved, and maybe not exactly what you want.
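One simple way to approximate the index idea, without emulating a 32-bit address space, is to keep all nodes in a single std::vector and link them by 32-bit position. The sketch below is an assumption-laden illustration: IndexList and its members are hypothetical names, and unlinked slots are not reused.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical index-linked list: nodes live in one std::vector and refer to
// each other by 32-bit index, so prev + next together cost 8 bytes.
class IndexList {
public:
    static constexpr std::uint32_t npos = 0xFFFFFFFFu;  // "no node" marker

    // Append a value and return the index of its node.
    std::uint32_t push_back(int value) {
        const std::uint32_t idx = static_cast<std::uint32_t>(nodes_.size());
        const std::uint32_t none = npos;
        nodes_.push_back(Node{value, tail_, none});
        if (tail_ != none) {
            nodes_[tail_].next = idx;
        } else {
            head_ = idx;
        }
        tail_ = idx;
        return idx;
    }

    // Unlink a node in O(1) given only its index (the slot is not reused).
    void unlink(std::uint32_t idx) {
        const Node n = nodes_[idx];
        const std::uint32_t none = npos;
        if (n.prev != none) nodes_[n.prev].next = n.next; else head_ = n.next;
        if (n.next != none) nodes_[n.next].prev = n.prev; else tail_ = n.prev;
    }

    // Collect the values front to back.
    std::vector<int> forward() const {
        std::vector<int> out;
        const std::uint32_t none = npos;
        for (std::uint32_t i = head_; i != none; i = nodes_[i].next) {
            out.push_back(nodes_[i].value);
        }
        return out;
    }

    // Collect the values back to front.
    std::vector<int> backward() const {
        std::vector<int> out;
        const std::uint32_t none = npos;
        for (std::uint32_t i = tail_; i != none; i = nodes_[i].prev) {
            out.push_back(nodes_[i].value);
        }
        return out;
    }

private:
    struct Node {
        int value;
        std::uint32_t prev;
        std::uint32_t next;
    };
    std::vector<Node> nodes_;
    std::uint32_t head_ = 0xFFFFFFFFu;
    std::uint32_t tail_ = 0xFFFFFFFFu;
};
```

Storing the nodes contiguously also tends to help with the locality concerns discussed in this answer, although indices stay valid only as long as slots are never moved or compacted.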
However, to reap the full performance benefits of a linked structure (trees included) often requires you to get back control over how the nodes are allocated and distributed in memory. Linked structures tend to be bottlenecky because, if you just use malloc or plain operator new for every node, e.g., you lose control over memory layout. Often (not always -- you can get lucky depending on the memory allocator, and whether you allocate all your nodes at once or not) this means a loss of contiguity, which means a loss of spatial locality.
It's why data-oriented design stresses arrays more than anything else: linked structures are not normally very friendly for performance. The process of moving chunks from larger memory to smaller, faster memory likes it if you are going to access neighboring data within the same chunk (cache line/page, e.g.) prior to eviction.
The Not-So-Often-Cited Unrolled List
So there's a hybrid solution here which is not so often discussed, which is the unrolled list. Example:
struct Element
{
...
};
struct UnrolledNode
{
struct Element elements[32];
struct UnrolledNode* prev;
struct UnrolledNode* next;
};
An unrolled list combines the characteristics of arrays and doubly-linked lists all into one. It'll give you back a lot of spatial locality without having to look to the memory allocator.
It can traverse forwards and backwards, it can remove arbitrary elements from the middle at any given time for cheap.
And it reduces the linked list overhead to the absolute minimum: in this case I hard-coded an unrolled array size of 32 elements per node. That means the cost of storing the list pointers has shrunk to 1/32nd of its normal size. That's even cheaper from a list pointer overhead standpoint than a singly-linked list, with often faster traversal (because of cache locality).
It's not a perfect replacement for a doubly-linked list. For a start, if you are worried about the invalidation of existing pointers to elements in the list on removal, then you have to start worrying about leaving vacant spaces (holes/tombstones) behind that get reclaimed (possibly by associating free bits in each unrolled node). At that point you're dealing with many similar concerns of implementing a memory allocator, including some minor forms of fragmentation (ex: having an unrolled node with 31 vacant spaces and only one element occupied -- the node still has to stay around in memory to avoid invalidation until it becomes completely empty).
An "iterator" to it which allows insertion/removal to/from the middle typically has to be larger than a pointer (unless, as noted in the comments, you store additional metadata with each element). It can waste memory (typically moot unless you have really teeny lists) by requiring, say, the memory for 32 elements even if you have a list of only 1 element. It does tend to be a little more complex to implement than any of these above solutions. But it's a very useful solution in a performance-critical scenario, and often one that probably deserves more attention. It's one that's not brought up so much in computer science since it doesn't do any better from an algorithmic perspective than a regular linked list, but locality of reference has a significant impact on performance as well in real-world scenarios.
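To make the unrolled list more concrete, here is a small sketch of appending and traversing in both directions. It makes several assumptions that are not in the answer: elements are plain int, each node carries a count field, the chunk size is 4 instead of 32 to keep the example small, and removal (with the hole handling described above) is left out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr std::size_t kChunk = 4;  // assumption: small chunk for the demo

struct UnrolledNode {
    int elements[kChunk];
    std::size_t count = 0;         // assumption: number of used slots
    UnrolledNode* prev = nullptr;
    UnrolledNode* next = nullptr;
};

class UnrolledList {
public:
    UnrolledList() = default;
    UnrolledList(const UnrolledList&) = delete;
    UnrolledList& operator=(const UnrolledList&) = delete;

    ~UnrolledList() {
        for (UnrolledNode* n = head_; n != nullptr;) {
            UnrolledNode* next = n->next;
            delete n;
            n = next;
        }
    }

    // Append to the last node, allocating a new node only when it is full.
    void push_back(int value) {
        if (tail_ == nullptr || tail_->count == kChunk) {
            UnrolledNode* node = new UnrolledNode;
            node->prev = tail_;
            if (tail_ != nullptr) tail_->next = node; else head_ = node;
            tail_ = node;
        }
        tail_->elements[tail_->count++] = value;
    }

    // Front-to-back traversal: walk nodes, then the array inside each node.
    std::vector<int> forward() const {
        std::vector<int> out;
        for (const UnrolledNode* n = head_; n != nullptr; n = n->next) {
            for (std::size_t i = 0; i < n->count; ++i) out.push_back(n->elements[i]);
        }
        return out;
    }

    // Back-to-front traversal using the prev links.
    std::vector<int> backward() const {
        std::vector<int> out;
        for (const UnrolledNode* n = tail_; n != nullptr; n = n->prev) {
            for (std::size_t i = n->count; i > 0; --i) out.push_back(n->elements[i - 1]);
        }
        return out;
    }

    // Number of allocated nodes (one allocation per kChunk elements).
    std::size_t node_count() const {
        std::size_t n = 0;
        for (const UnrolledNode* p = head_; p != nullptr; p = p->next) ++n;
        return n;
    }

private:
    UnrolledNode* head_ = nullptr;
    UnrolledNode* tail_ = nullptr;
};
```

With ten elements and a chunk size of 4, only three nodes are allocated, which is where the reduced pointer overhead and allocation count come from.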
It's not completely possible. A doubly linked list requires two pointers, one for the link in each direction.
Depending on what you need the XOR linked list may do what you need (see HolyBlackCat's answer).
Another option is to work around this limitation a little by doing things like remembering the last node you processed as you iterate through the list. This will let you go back one step during the processing but it doesn't make the list doubly linked.
You can declare and maintain two pointers to nodes, head and tail. In this case you will be able to add nodes to both ends of the list.
Such a list is sometimes called a two-sided list.
However the list itself will be a forward list.
Using such a list you can for example simulate a queue.
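A short sketch of such a two-sided forward list, with hypothetical names (TwoSidedList, push_front, push_back, pop_front). Note that only removal from the front is cheap here, since there is no link back from the tail.

```cpp
#include <cassert>

// Hypothetical singly-linked list that tracks both head and tail, so nodes
// can be added at either end in O(1) and removed from the front in O(1).
class TwoSidedList {
public:
    TwoSidedList() = default;
    TwoSidedList(const TwoSidedList&) = delete;
    TwoSidedList& operator=(const TwoSidedList&) = delete;
    ~TwoSidedList() { while (!empty()) pop_front(); }

    bool empty() const { return head_ == nullptr; }

    void push_front(int value) {
        head_ = new Node{value, head_};
        if (tail_ == nullptr) tail_ = head_;   // first node is also the tail
    }

    void push_back(int value) {
        Node* node = new Node{value, nullptr};
        if (tail_ != nullptr) tail_->next = node; else head_ = node;
        tail_ = node;
    }

    // Remove and return the front value; the list must not be empty.
    int pop_front() {
        assert(!empty());
        Node* old = head_;
        const int value = old->value;
        head_ = old->next;
        if (head_ == nullptr) tail_ = nullptr; // list became empty
        delete old;
        return value;
    }

private:
    struct Node {
        int value;
        Node* next;
    };
    Node* head_ = nullptr;
    Node* tail_ = nullptr;
};
```

Using push_back to enqueue and pop_front to dequeue gives the queue behaviour mentioned above.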
It is not possible in a portable way without invoking undefined behavior:
Can an XOR linked list be implemented in C++ without causing undefined behavior?

Create dynamic array of objects

I want to create a dynamic array of a specific object that would also support adding new objects to the array.
I'm trying to solve this as part of an exercise in my course. In this exercise we are not supposed to use std::vector.
For example, let's say I have a class named Product and declare a pointer:
Products* products;
then I want to support the following:
products = new Product();
/* code here... */
products[1] = new Product(); // and so on...
I know the current syntax could lead to access violation. I don't know the size of the array in advance, as it can change throughout the program.
The questions are:
How can I write it without vectors?
Do I have to use double pointers (2-dimension)?
Every time I want to add a new object, do I have to copy the array to the new array (with +1 size), and then delete the array?
1) You should not write this without std::vector. If you for some reason need to, copying on every resize is by far the easiest option.
2) I do not see how that would help (i.e. no).
3) As mentioned above, this is by far the easiest method if you cannot use std::vector. Everything else would be (partially) reinventing one standard library container or another, which is hard.
1) You have to do your own memory management. More specifically, regarding your other (related) questions:
2) No, if you have a contiguous allocated chunk of memory where your data lives.
3) Yes, if 2) is your desired implementation method. However, if you don't want to use one large memory chunk, you have to use a (doubly) linked list, which does not require you to copy the whole array every time.
I see people have already answered your specific questions, so I'll give a more general answer.
You must implement it yourself, but there are lots of abstract data types that you can use. As far as I can see, the simplest would be a linked list, such as the following:
#include <cstddef> // NULL

// Assumes a complete Product class is defined elsewhere.
class ProductNode
{
public:
    ProductNode() : _data(NULL), _next(NULL)
    {
    }
    void setProduct(Product* p) // setter for the product pointer
    {
        this->_data = p;
    }
    Product getProduct() // returns a copy of the stored product; _data must not be NULL
    {
        return *(this->_data);
    }
    void addNext() // allocate memory for another ProductNode in '_next'
    {
        if (!this->_next)
        {
            this->_next = new ProductNode();
        }
    }
    ProductNode* getNext() // get the allocated memory, the address that is in '_next'
    {
        return this->_next;
    }
    ~ProductNode() // delete every node after this one; the deletion recurses down the list
    {
        delete this->_next;
    }
private:
    Product* _data;
    ProductNode* _next;
};
Declare a head variable and go from there.
Of course, most of the functions here should be implemented differently; it was coded quickly so you could see the basics that you need for this assignment.
That's one way.
You can also make your own data type.
Or use some other data types to abstract the data.
What you probably should do (i.e. what I believe you're expected to do) is write your own class that represents a dynamic array (i.e. you're going to reinvent parts of std::vector.)
Despite what many around here say, this is a worthwhile exercise and should be part of a normal computer science curriculum.
1) Use a dynamically allocated array which is a member of your class.
2) If you're using a dynamically allocated array of Product*, you'll be storing a Product**, so yes, in a way. It's not necessary to have "double pointers" for the functionality itself, though.
3) Technically no - you can allocate more than necessary and only reallocate and copy when you run out of space. This is what vector does to improve its time complexity.
Expanding for each element is a good way to start though, and you can always change your strategy later.
(Get the simple way working first, get fancy if you need to.)
It's probably easiest to first implement an array of int (or some other basic numerical type) and make sure that it works, and then change the type of the contents.
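Following that advice, a first version for int might look like the sketch below. IntArray and its members are hypothetical names, and the doubling growth policy is one possible choice (the "allocate more than necessary" strategy), not the only one.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal dynamic array of int that doubles its capacity
// whenever it runs out of space.
class IntArray {
public:
    IntArray() = default;
    IntArray(const IntArray&) = delete;            // copying omitted for brevity
    IntArray& operator=(const IntArray&) = delete;
    ~IntArray() { delete[] data_; }

    void push_back(int value) {
        if (size_ == capacity_) {
            grow();
        }
        data_[size_++] = value;
    }

    int& operator[](std::size_t index) {
        assert(index < size_);
        return data_[index];
    }

    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    // Allocate a larger block, copy the old elements over, free the old block.
    void grow() {
        const std::size_t new_capacity = (capacity_ == 0) ? 1 : capacity_ * 2;
        int* bigger = new int[new_capacity];
        for (std::size_t i = 0; i < size_; ++i) {
            bigger[i] = data_[i];
        }
        delete[] data_;
        data_ = bigger;
        capacity_ = new_capacity;
    }

    int* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
};
```

Once this works for int, the element type can be swapped for Product (or Product*), at which point copy semantics and construction of unused slots need more care.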
I suppose by "Products *products;" you mean "Products" is a vector-like container.
1) How can I write it without vectors?
As a linked list. Instantiating a "Products products" will give you an empty linked list.
Overloading operator[] would let you insert or replace an element in the list, so you need to scan the list to find the right position. If the list is shorter than the requested index, you may need to append "neutral" placeholder elements before your element. This is not feasible through "Product *products" if you plan to overload operator[] to handle adding elements; you would have to declare "Products products" instead.
2) Do I have to use double pointers (2-dimension)?
This question lacks precision. Do you mean "typedef Product *Products;" followed by "Products *products"? As long as you keep a " * " between "Products" and "products", there is no way to overload operator[] to handle adding an element.
3) Every time I want to add a new object, do I have to copy the array to the new array (with +1 size), and then delete the array?
If you stick with an array, you can reduce the number of reallocations to O(log2(n)) by doubling the array size each time it fills up (presumably you keep a terminating sentinel or an element count alongside it). Or just use a linked list instead, to avoid copying all the elements before adding one.

Implement a heap not using an array

I'm prepping for a Google developer interview and have gotten stuck on a question about heaps. I need to implement a heap as a dynamic binary tree (not array) where each node has a pointer to the parent and two children and there is a global pointer to the root node. The book asks "why won't this be enough?"
How can the standard tree implementation be extended to support heap operations add() and deleteMin()? How can these operations be implemented in this data structure?
Can you keep track of the total number of nodes? If so, it's easy to know where you should add a new element, because the heap is an almost complete tree.
As for deleteMin, I think it will be less efficient because you can't directly access all the leaves as you can in an array (N/2).
You would have to travel down all paths until you reach the leaves and then compare them, which will probably cost O(n).
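If the node count is tracked, one known trick is to read the binary digits of a 1-based level-order position as a path from the root (0 = left, 1 = right, skipping the leading 1). The sketch below applies it to both add() and deleteMin(); since the same count also locates the last node, the O(n) search described above may be avoidable. Names such as TreeHeap, locate and delete_min are hypothetical, and keys are swapped between nodes rather than relinking them.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

struct HeapNode {
    int key = 0;
    HeapNode* parent = nullptr;
    HeapNode* left = nullptr;
    HeapNode* right = nullptr;
};

// Hypothetical pointer-based min-heap that tracks its node count.
class TreeHeap {
public:
    TreeHeap() = default;
    TreeHeap(const TreeHeap&) = delete;
    TreeHeap& operator=(const TreeHeap&) = delete;
    ~TreeHeap() { while (size_ > 0) delete_min(); }

    std::size_t size() const { return size_; }

    void add(int key) {
        HeapNode* node = new HeapNode;
        node->key = key;
        ++size_;
        if (size_ == 1) { root_ = node; return; }
        HeapNode* parent = locate(size_ / 2);      // parent of the new last slot
        node->parent = parent;
        if (size_ % 2 == 0) parent->left = node; else parent->right = node;
        // Sift up by swapping keys with the parent while out of order.
        while (node->parent != nullptr && node->key < node->parent->key) {
            std::swap(node->key, node->parent->key);
            node = node->parent;
        }
    }

    int delete_min() {
        assert(size_ > 0);
        const int min = root_->key;
        HeapNode* last = locate(size_);            // last node in level order
        root_->key = last->key;
        if (last->parent == nullptr) root_ = nullptr;
        else if (last->parent->left == last) last->parent->left = nullptr;
        else last->parent->right = nullptr;
        delete last;
        --size_;
        // Sift down by swapping keys with the smaller child.
        HeapNode* node = root_;
        while (node != nullptr && node->left != nullptr) {
            HeapNode* child = node->left;
            if (node->right != nullptr && node->right->key < child->key) child = node->right;
            if (node->key <= child->key) break;
            std::swap(node->key, child->key);
            node = child;
        }
        return min;
    }

private:
    // Walk from the root to the node at 1-based level-order position `pos`.
    HeapNode* locate(std::size_t pos) const {
        std::size_t mask = 1;
        while ((mask << 1) <= pos) mask <<= 1;     // find the highest set bit
        HeapNode* node = root_;
        for (mask >>= 1; mask != 0; mask >>= 1) {
            node = (pos & mask) ? node->right : node->left;
        }
        return node;
    }

    HeapNode* root_ = nullptr;
    std::size_t size_ = 0;
};
```

Each operation follows one root-to-leaf path, so under these assumptions both cost time proportional to the height of the tree.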

The fastest dynamic data structure in C++ [closed]

Closed 8 years ago.
I've got a task which mainly consists of adding or removing elements from an array in C++. Since arrays aren't dynamic but operations on them are very fast, I've been looking for a dynamic data structure which is nearly as fast to operate on. I've been thinking about std::vector, but since it is a predefined and quite massive construct, I'm worried about the time its operations take, which is crucial for me. Could anybody share their point of view? I'd be very glad for any help!
edited:
I'm really sorry I didn't include all the important points in my question; below I'll try to add more info:
I'll be traversing the elements of the structure many times and accessing them in a random manner, so operations on elements at every possible position are possible.
I think that there will be (depending on the tests provided) many operations on elements in the middle of the data structure as well as near its ends.
I believe that will help my post to be more clear, specific and, thus, more useful for others.
Thank You for all the answers!
Refer to Mikael Persson's "container choice" diagram:
http://www.cim.mcgill.ca/~mpersson/pics/STLcontainerChoices.png
The different data structures were implemented in the STL to be used for different reasons. Therefore the structures differ when it comes to insertion/deletion speeds at the start, the middle or the end of the structures or even when it comes to the random access of the structure elements.
A nice short comparison of STL containers:
http://john-ahlgren.blogspot.com/2013/10/stl-container-performance.html
If it's possible for you to use an associative array, maps at least guarantee an insertion/look-up time of O(log n) which is a good bit faster for large amounts of data/lots of insertions and deletes than vector's guarantee of O(n) for non-back insertions.
Not sure if they will work here or not, this link also shows some graphs of benchmarks using random insert/removes/searches/fills/sorts, etc. on several different containers:
http://www.baptiste-wicht.com/2012/12/cpp-benchmark-vector-list-deque/
Lastly, a flow chart from SO that could help you decide on a container:
In which scenario do I use a particular STL container?
While not perfect, it still might turn out that a vector is your best bet.
Will a linked list implemented using an array meet your needs?
class AList
{
public:
AList()
{
for (int i = 0; i != 256; ++i)
{
nodes[i].prev = (i-1+256)%256;
nodes[i].next = (i+1)%256;
}
}
int const& operator[](int index)
{
// Deal with the case where nodes[index].isSet == false
return nodes[index].data;
}
// Not sure what the requirements are for adding
// and removing items from the list.
//
// add();
// remove();
private:
struct Node
{
Node() : data(0), prev(0), next(0), isSet(false) {}
int data;
unsigned char prev;
unsigned char next;
bool isSet;
};
Node nodes[256];
};

Vector versus dynamic array, does it make a big difference in speed?

I am writing some code for solving vehicle routing problems. One important decision is how to encode the solutions. A solution contains several routes, one for each vehicle. Each route has a customer visiting sequence, a load, and a length.
To modify a solution, I also need to be able to quickly look up some information.
For example,
Which route is a customer in?
What customers does a route have?
How many nodes are there in a route?
What nodes are in front of or behind a node?
Now, I am thinking of using the following structure to store a solution.
struct Sol
{
vector<short> nextNode; // show what is the next node of each node;
vector<short> preNode; //show what is the preceding node
vector<short> startNode;
vector<short> rutNum;
vector<short> rutLoad;
vector<float> rutLength;
vector<short> rutSize;
};
The common size of each vector is instance dependent, between 200-2000.
I heard it is possible to use a dynamic array to do this job. But it seems to me that a dynamic array is more complicated: one has to allocate the memory and release it. My question is twofold.
How can I use a dynamic array for the same purpose? How should I define the struct or class so that memory allocation and release are easily taken care of?
Will using a dynamic array be faster than using a vector, assuming the solution structure needs to be accessed millions of times?
It is highly unlikely that you'll see an appreciable performance difference between a dynamic array and a vector since the latter is essentially a very thin wrapper around the former. Also bear in mind that using a vector would be significantly less error-prone.
It may, however, be the case that some information is better stored in a different type of container altogether, e.g. in an std::map. The following might be of interest: What are the complexity guarantees of the standard containers?
It is important to give some thought to the type of container that gets used. However, when it comes to micro-optimizations (such as vector vs dynamic array), the best policy is to profile the code first and only focus on parts of the code that prove to be real -- rather than assumed -- bottlenecks.
It's quite possible that vector's code is actually better and more performant than dynamic array code you would write yourself. Only if profiling shows significant time spent in vector would I consider writing my own error-prone replacement. See also Dynamically allocated arrays or std::vector
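For comparison, here is a rough sketch of one field of the solution struct in both styles. SolVec, SolArr and sum are hypothetical names; the dynamic-array version uses std::unique_ptr<short[]> as one way to get automatic release, which is an assumption rather than something the question specified.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Vector-backed version: size and cleanup are handled by std::vector.
struct SolVec {
    std::vector<short> nextNode;
    explicit SolVec(std::size_t n) : nextNode(n, 0) {}
};

// Dynamic-array version: std::unique_ptr<short[]> frees the memory
// automatically, but the size has to be stored and kept in sync by hand.
struct SolArr {
    std::size_t size;
    std::unique_ptr<short[]> nextNode;
    explicit SolArr(std::size_t n) : size(n), nextNode(new short[n]()) {}
};

// Both layouts expose one contiguous buffer, so the access code is identical.
long sum(const short* data, std::size_t n) {
    long total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        total += data[i];
    }
    return total;
}
```

Since both versions index into a contiguous buffer, any speed difference would have to come from something other than element access, which fits the advice above to profile before replacing the vector.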
I'm using MSVC and the implementation looks to be as quick as it can be.
Accessing the array via operator [] is:
return (*(this->_Myfirst + _Pos));
Which is as quick as you are going to get with dynamic memory.
The only overhead you are going to get is in the memory use of a vector: it seems to store a pointer to the start of the vector, the end of the vector, and the end of the current sequence. This is only 2 more pointers than you would need if you were using a dynamic array. You are only creating 200-2000 of these, so I doubt memory is going to be that tight.
I am sure the other STL implementations are very similar. I would absorb the minor cost of vector storage and use them in your project.