Difference between linked lists and array of structs? - c++

What's the difference between these two pieces of code?
1)
struct MyStruct
{
int num;
} ms[2];
ms[0].num = 5;
ms[1].num = 15;
2)
struct MyStruct
{
int num;
MyStruct *next;
};
MyStruct *ms = new MyStruct;
ms->num = 5;
ms->next = new MyStruct;
ms->next->num = 15;
I'm probably a little confused about linked lists and lists in general. Are they useful for something in particular? Please explain more.

Your first definition...
struct MyStruct
{
int num;
} ms[1];
...creates a statically allocated array with a single element. You cannot change the size of the array while your program is running; this array will never hold more than one element. You can access items in the array by direct indexing; e.g., ms[5] would get you the sixth element in the array (remember, C and C++ arrays are 0-indexed, so the first element is ms[0]), assuming that you had defined an array of the appropriate size.
Your second definition...
struct MyStruct
{
int num;
MyStruct *next;
};
...creates a dynamically allocated linked list. Memory for this list is allocated dynamically during runtime, and the linked list can grow (or shrink) during the lifetime of the program. Unlike arrays, you cannot directly access any element in the list; to get to the sixth element you have to start at the first element and then iterate 5 times.
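To make the "start at the first element and iterate" point concrete, here is a minimal sketch of reaching the n-th node. The helper name nth_node is an assumption for illustration and is not part of the question's code; it assumes the last node's next pointer is null.

```cpp
#include <cstddef>

// Same shape as the node in the question.
struct MyStruct {
    int num;
    MyStruct *next;
};

// Walk `index` links from `head` (index 0 is the head itself).
// Returns nullptr if the list has fewer than index + 1 nodes.
// nth_node is a hypothetical helper name used only for this sketch.
MyStruct *nth_node(MyStruct *head, std::size_t index) {
    MyStruct *cur = head;
    while (cur != nullptr && index > 0) {
        cur = cur->next;
        --index;
    }
    return cur;
}
```

With an array the equivalent is just ms[index]; here the cost grows with index because every intermediate node has to be visited.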

Regarding the errors in your code: the first snippet constructs a fixed number of MyStruct elements and stores them in the array ms. You meant it to hold 2 elements, and you cannot add any more elements to ms later, so the number of MyStruct elements is limited. In the second case you have a linked list, so you can chain as many MyStruct elements as you want at run time, which gives you a dynamic number of elements. Conceptually, the second case looks like this in memory:
[ MyStruct#1 ] ----> [ MyStruct#2 ] ----> [ NULL ]
NULL could later be replaced by a MyStruct#3, for example. The first case, by contrast, looks like this:
[ MyStruct#1 ] ----> [ MyStruct#2 ]
and that's it, no MyStruct#3 can be added.
Now let's go through the code you wrote:
struct MyStruct
{
int num;
} ms[1];
ms[1] really means "create an array ms of one MyStruct element".
The code that follows assumes you created two:
ms[0].num = 5;
ms[1].num = 15;
Hence it should have been:
struct MyStruct
{
int num;
} ms[2];
And it will work fine! Keep in mind the simple illustration I made for it:
[ MyStruct#1 ] ----> [ MyStruct#2 ]
Second Case:
struct MyStruct
{
int num;
MyStruct *next;
};
MyStruct *ms = new MyStruct;
ms->num = 5;
ms->next = new MyStruct;
ms->next->num = 15;
This code uses the C++ operator new. If you save your source code as .cpp, you will be able to compile it as a C++ application with no errors. For C, the syntax should change like so:
struct MyStruct
{
int num;
struct MyStruct *next; /* C requires the struct keyword here */
};
struct MyStruct *ms = (struct MyStruct *) malloc(sizeof(struct MyStruct));
ms->num = 5;
ms->next = (struct MyStruct *) malloc(sizeof(struct MyStruct));
ms->next->num = 15;
Don't forget to #include <stdlib.h> for the malloc() function; you can read more about this function here.
As with the first case, recall my illustration for the linked list:
[ MyStruct#1 ] ----> [ MyStruct#2 ] ----> [ NULL ]
Here NULL is the next member of the ms->next MyStruct structure. To explain further: ms->next is a pointer to MyStruct, and we have allocated space for it on the heap, so it now points to a block of memory the size of a MyStruct structure. Note that neither new nor malloc initializes that block for you, so set ms->next->next = NULL explicitly to terminate the list.
Finally, here is a Stack Overflow question about when to use a linked list and when to use an array, so you can see why people prefer a linked list in some situations and an array in others.

Oh, my friend, there are dozens of different kinds of data structures that pretty much just hold a bunch of num values or whatever. The reason programmers don't just use arrays for everything is the differences in the amount of memory required, and the ease of doing whichever operations are most important for your particular needs.
Linked lists happen to be very quick at adding or removing individual items. The trade-off is that finding an item in the middle of the list is relatively slow, and the next pointers take extra memory. A properly sized array is very compact in memory, and you can access an item in the middle very quickly, but to add a new item at the end you either have to know the maximum number of elements beforehand, which is often impossible or wastes memory, or reallocate a larger array and copy everything over, which is slow.
Therefore, someone who doesn't know how big their list needs to be, and who mostly only needs to deal with items at the beginning or end of the list or always loops over the entire list, and cares more about execution speed than saving a few bytes of memory, is very likely to choose a linked list over an array.
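As a rough illustration of why adding at the beginning of a linked list is cheap, here is a minimal sketch. Node, push_front and destroy are hypothetical names chosen for this example; nothing existing is moved or copied when a node is added.

```cpp
// Singly linked node, same layout as MyStruct in the question.
struct Node {
    int num;
    Node *next;
};

// Insert a new node in front of `head` and return the new head.
// Constant time: no existing node is touched.
Node *push_front(Node *head, int value) {
    return new Node{value, head};
}

// Release every node in the list.
void destroy(Node *head) {
    while (head != nullptr) {
        Node *next = head->next;
        delete head;
        head = next;
    }
}
```

Doing the same at the front of an array means shifting every existing element one slot to the right, which is where the array loses.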

The main differences between lists and arrays in general:
Ordering in lists is explicit; each element stores the location of the preceding/succeeding element. Ordering in arrays is implicit; each element is assumed to have a preceding/succeeding element. Note that a single list may contain multiple orderings. For example, you could have something like
struct dualList {
T data1;
K data2;
struct dualList *nextT;
struct dualList *nextK;
};
that allows you to order the same list two different ways, one by data1 and the other by data2.
Adjacent array elements are in adjacent memory locations; adjacent list elements don't have to be in adjacent locations.
Arrays offer random access to their elements; lists only offer sequential access (i.e., you have to walk down the list to find an element).
Arrays are (usually) fixed in length [1]: adding elements to or removing elements from the array doesn't change the array's size. Lists can grow or shrink as needed.
Lists are great for maintaining a dynamically changing sequence of values, especially if the values need to remain ordered. They're not so hot for storing relatively static data that needs to be retrieved quickly and frequently, since you can't access elements randomly.
[1] You can get around this by allocating memory dynamically and then using realloc to resize that memory block as needed, but it needs to be done carefully and can be a bit of a PITA.
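A minimal sketch of the realloc approach, assuming the buffer holds plain ints and was obtained from malloc (realloc must not be mixed with new[]). The helper name resize_ints is hypothetical.

```cpp
#include <cstdlib>

// Resize a malloc'd int buffer to hold `new_count` elements (new_count > 0).
// On success the returned pointer replaces `buf`, and the old contents are
// preserved up to the smaller of the two sizes.
// On failure nullptr is returned and `buf` is still valid, so the caller
// must keep the old pointer around until the result has been checked.
int *resize_ints(int *buf, std::size_t new_count) {
    return static_cast<int *>(std::realloc(buf, new_count * sizeof(int)));
}
```

The "carefully" part is mostly that failure case: writing buf = realloc(buf, ...) directly leaks the old block if realloc returns a null pointer.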

Linked lists are useful when element ordering is important and the number of elements is not known in advance. The cost is that accessing an element in a linked list takes O(n) time: when you look for an element in a list, in the worst case you have to look at every element of the list.
For an array, the number of elements must be known in advance. When you define an array in C, you have to give it a size. On the other hand, accessing an array element takes O(1) time, since an element can be addressed by index. With a linked list, that is not possible.
However, this is not really a C++-specific question, since the concepts of linked list and array are not tied to C++.

An array is a contiguous pre-allocated block of memory, whereas a linked list is a collection of runtime-allocated (malloc) pieces of memory (not necessarily contiguous) linked to each other via pointers (*next). You would generally use an array of structs if you know at compile time the maximum number of elements you need to store. A linked list of structs, however, is useful if you don't know the maximum number of elements that will need to be stored. Also, with a linked list the number of elements may change as you add and remove elements.

Related

Dynamic, indexed array of structs in c++

How do I create a dynamic array of structs in C++ with keys that are not contiguous?
I have a struct like this:
struct Data {
int name;
int val;
};
I'm reading names from file of variable length.
Let's say it is:
1 2
3 5
7 12
In the end I'd like to have a (dynamic) array indexed by name,
like this (pseudo code):
arr[1] = *Data(1,2)
arr[3] = *Data(3,5)
arr[7] = *Data(7,12)
Expected result:
cout << arr[7]->val // outputs 12
cout << arr[3]->val // outputs 5
// array size = 3
How to write this in c++?
(assuming only basic features, no vectors, maps etc.)
So far I've tried something like this:
Data *distances = new Data[15]; // explicit size works
// Data *distances = new Data; // this doesn't
distances[5] = myDataStruct;
This allocates memory for exactly one Data element.
Data *distances = new Data;
This allocates memory for 15 Data elements.
Data *distances = new Data[15];
To get what you want you could either
allocate a maximum number of elements and always stay in this boundary or
allocate a number of elements and for each new element resize the array (either C style malloc/realloc/free or C++ style new/delete, but then for resizing you need to allocate a new array and copy the old values) or
provide a class that internally stores a list of all elements and exports an operator[] which does not do an array lookup but a find in the list
Naïve implementation (this doesn't support adding new elements into the array, only lookup from an existing one. I'll leave a complete implementation as an exercise to the reader):
struct associative {
Data* data_arr;
Data* end;
Data* operator[](int i) {
return linear_search(data_arr, end, i);
}
};
You could use std::find (with custom predicate) in place of linear_search but if that isn't included in "basic features" by your definition, then you'll just have to re-implement it. Linear search is not difficult to implement.
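For reference, a minimal sketch of what linear_search could look like, matching the way it is called above. Returning nullptr for a missing key is an assumption of this sketch; the answer does not say what should happen in that case.

```cpp
// Same struct as in the question.
struct Data {
    int name;
    int val;
};

// Return a pointer to the first element in [first, last) whose name
// equals `name`, or nullptr if no such element exists (assumed behavior).
Data *linear_search(Data *first, Data *last, int name) {
    for (Data *p = first; p != last; ++p) {
        if (p->name == name) {
            return p;
        }
    }
    return nullptr;
}
```

Each lookup costs O(n) in the number of stored elements, which is fine for a handful of entries like the three in the question.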
A few ways this naïve data structure can be improved:
RAII ownership over the pointed memory would make the data structure much safer to use.
Support for adding new elements into the internal array
A better data-structure instead of an array: Binary search tree, or a hash map would allow asymptotically faster lookup.
You are now re-implementing std::map or the unordered variant. Consider not reinventing the wheel.

trim array to elements between i and j

A classic; I'm looking for optimisation here: I have an array of things, and after some processing I know I'm only interested in elements i to j. How do I trim my array in the fastest, lightest way, with complete deletion/freeing of the memory of elements before i and after j?
I'm doing embedded C++, so I may not be able to compile all sorts of libraries. But std or vector solutions are welcome as a first step!
I've tried, for array A to be trimmed between i and j, with variable numElms telling me the number of elements in A :
A = &A[i];
numElms = j-i+1;
As it is this yields an incompatibility error. Can that be fixed, and even when fixed, does that free the memory at all for now-unused elements?
A little context: this array is the central data set of my module, and it can be heavy. It will live as long as the module lives, and there's no need to carry dead weight all this time. This is the very first thing that is done: figuring out which segment of the data set has to be analyzed at all, then trimming and dumping the rest forever, never to use it again (until the next cycle, where we get a fresh array with possibly a completely different size).
When asking questions about speed, your mileage may vary based on the size of the array you're working with, but:
Your fastest way will be to not trim the array, just use A[index + i] to find the elements you want.
The lightest way to do this would be to:
Allocate a dynamic array with malloc
Once i and j are found copy that range to the head of the dynamic array
Use realloc to resize the dynamic array to the size j - i + 1
However you have this tagged as C++ not C, so I believe that you're also interested in readability and the required programming investment, not raw speed or weight. If this is true then I would suggest use of a vector or deque.
Given vector<thing> A or a deque<thing> A you could do:
A.erase(cbegin(A), next(cbegin(A), i));
A.resize(j - i + 1);
There is no way to change the size of an allocated memory block in standard C++ (unless you have POD data, in which case C facilities like realloc can be used). The only way to trim an array is to allocate a new array, copy or move the needed elements, and destroy the old array.
You can do it manually, or using vectors:
int* array = new int[10]{0,1,2,3,4,5,6,7,8,9};
std::vector<int> vec {0,1,2,3,4,5,6,7,8,9};
//We want only elements 3-5
{
int* new_array = new int[3];
std::copy(array + 3, array + 6, new_array);
delete[] array;
array = new_array;
}
vec = std::vector<int>(vec.begin()+3, vec.begin()+6);
If you are using C++11, both approaches should have the same performance.
If you only want to remove the extra elements and do not really need to release the memory (for example, you might want to add more elements later), you can follow NathanOliver's link.
However, you should consider: do you really need that memory freed immediately? Do you need to move the elements right now? Will your array live so long that this memory would be lost to your program completely? Maybe you need a range, or perhaps a view onto the array contents? In many cases you can store two pointers (or a pointer and a size) to denote your "new" array, while keeping the old one to be released all at once.
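The pointer-and-size idea can be sketched as a tiny non-owning view. IntView and make_view are hypothetical names for this example, and the element type is assumed to be int; the underlying array is left untouched and is still freed as a whole by whoever owns it.

```cpp
#include <cstddef>

// Non-owning window onto part of an existing array.
// IntView is a hypothetical type for illustration only.
struct IntView {
    int *data;         // points at element i of the original array
    std::size_t size;  // number of elements in the window

    int &operator[](std::size_t k) const { return data[k]; }
};

// Build a view covering elements i..j (inclusive) of `array`.
// Assumes i <= j and both are valid indices into `array`.
IntView make_view(int *array, std::size_t i, std::size_t j) {
    return IntView{array + i, j - i + 1};
}
```

This costs nothing to set up and moves no data, but it also frees nothing: the elements outside the window stay allocated until the original array is released.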

Memory fragmentation using std list?

I'm using a list of lists to store point data in my application.
Here is an example test I made:
//using list of lists
list<list<Point>> ls;
for(int i=0;i<10000;++i)
{
list<Point> lp;
lp.resize(4);
lp.push_back(Point(1,2));
ls.push_back(lp);
}
I assume that the memory used will be
10k elements * 5 Points * Point size = 10000*5*2*4 = 400,000 bytes + some overhead of the list container, but the memory used by the program rises dramatically.
Is it due to overhead of list container or maybe because of memory fragmentation?
EDIT:
add some info and another example
Point is mfc CPoint class or you can define your own point class with int x,y , I'm using VS2008 in debug mode, Win XP, and Window Task Manager to view memory of application
I can't use vector instead of outer list because I don't know total size N of it beforehand, so I must push_back every new entry.
here is modified example
int N=10000;
list<vector<CPoint>> ls;
for(int i=0;i<N;++i)
{
vector<CPoint> vp;
vp.resize(5);
vp.reserve(5);
ls.push_back(vp);
}
and I compare it to
CPoint* p= new CPoint[N*5];
It's not "+ some overhead of list container". List overhead is linear with the number of objects, not constant. There's 50,000 Points, but with each Point you also have two pointers (std::list is doubly-linked), and also with each element in ls, you have two pointers. Plus, each list is going to have a head and tail pointer.
So that's 140,002 (I think) extra pointers that your math doesn't account for. Note that this dwarfs the size of the Point objects themselves, since they're so small. Are you sure that list is the right container for you? vector has constant overhead, basically three pointers per container, which would be just 30,003 additional pointers on top of the Point objects. That's a large memory saving, if that is something that matters.
[Update based on Bill Lynch's comment] vector could allocate more space than 5 for your points. Worst-case, it will allocate twice as much space as you need. But since sizeof(Point) == sizeof(Point*) for you, that's still strictly better than list since list will always use three times as much space.
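The pointer counts above can be reproduced with a few lines of arithmetic. This sketch only encodes the answer's own model (two link pointers per node, a head and a tail pointer per list, three pointers per vector); the function names are hypothetical, and real implementations, especially debug builds, may add more.

```cpp
#include <cstddef>

// Extra pointers for a list of `outer` lists with `inner` elements each.
constexpr std::size_t list_of_lists_pointers(std::size_t outer, std::size_t inner) {
    return 2 * outer * inner   // prev/next on every Point node
         + 2 * outer           // prev/next on every node of the outer list
         + 2 * (outer + 1);    // head/tail for each inner list plus the outer one
}

// Extra pointers if every list were a vector: three per container.
constexpr std::size_t vector_of_vectors_pointers(std::size_t outer) {
    return 3 * (outer + 1);
}
```

For the question's numbers, list_of_lists_pointers(10000, 5) gives 140,002 and vector_of_vectors_pointers(10000) gives 30,003, matching the figures quoted in the answer.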

How to implement a compact linked list with array?

Here is the question of exercise CLRS 10.3-4 I am trying to solve
It is often desirable to keep all elements of a doubly linked list compact in storage,
using, for example, the first m index locations in the multiple-array representation.
(This is the case in a paged, virtual-memory computing environment.) Explain how to implement the procedures ALLOCATE OBJECT and FREE OBJECT so that the representation is compact. Assume that there are no pointers to elements of the linked list outside the list itself. (Hint: Use the array implementation of a stack.)
Here is my solution so far:
int free;
int allocate()
{
if(free == n+1)
return 0;
int tmp = free;
free = next[free];
return tmp;
}
int deallocate(int pos)
{
for(;pos[next]!=free;pos[next])
{
next[pos] = next[next[pos]];
prev[pos] = prev[next[pos]];
key[pos] = key[next[pos]];
}
int tmp = free;
free = pos;
next[free] = tmp;
}
Now, the problem is: if this is the case, we don't need a linked list. If deletion is O(n), we can implement it using a normal array. Secondly, I have not used the array implementation of a stack either. So where is the catch? How should I start?
You don't have to shrink the list right away. Simply leave a hole and link that hole to your free list. Once you've allocated the memory, it's yours. So let's say your page size is 1K. Your initial allocated list size would then be 1K, even if the list is empty. Now you can add and remove items very effectively.
Then introduce another method to pack your list, i.e. remove all holes. Keep in mind that after calling the pack-method, all 'references' become invalid.
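A minimal sketch of the "leave a hole and link it to the free list" idea, with the free list used as a stack threaded through next[]. Pool, free_head and the fixed capacity N are assumptions of this sketch, and it uses 0-based indices with -1 as NIL rather than the textbook's convention. The separate pack step is not shown.

```cpp
#include <cassert>

const int N = 8;  // assumed fixed capacity for the sketch

// Multiple-array storage for a doubly linked list plus a free list.
struct Pool {
    int key[N];
    int next[N];
    int prev[N];
    int free_head;  // top of the free-slot stack, -1 when empty

    // Initially every slot is free and chained 0 -> 1 -> ... -> N-1.
    Pool() : free_head(0) {
        for (int i = 0; i < N; ++i) {
            next[i] = (i + 1 < N) ? i + 1 : -1;
        }
    }

    // Pop a slot off the free list; returns -1 if the pool is exhausted.
    int allocate() {
        if (free_head == -1) return -1;
        int slot = free_head;
        free_head = next[slot];
        return slot;
    }

    // Push the slot back onto the free list in O(1), leaving a hole.
    // The caller must already have unlinked it from the list proper.
    void deallocate(int slot) {
        assert(slot >= 0 && slot < N);
        next[slot] = free_head;
        free_head = slot;
    }
};
```

Both operations are O(1), and the most recently freed slot is the first one reused, so holes tend to be filled quickly even before any explicit packing.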

Is it necessary to delete elements as an array shrinks?

I'm a student writing a method that removes zeros from the end of an array of ints, in C++. The array is in a struct, and the struct also has an int that keeps track of the length of the array.
The method examines each element starting from the last, until it encounters the first non-zero element, and marks that one as the "last element" by changing the value of length. Then the method walks back up to the original "last element", deleting those elements that are not out of bounds (the zeros).
The part that deletes the ith element in the array if i is greater than the updated length of the array, looks like this:
if (i > p->length - 1) {
delete (p->elems + i); // free ith elem
That line is wrong, though. Delete takes a pointer, yes? So my feeling is that I need to recover the pointer to the array, and then add i to it so that I will have the memory location of the integer I want to delete.
Is my intuition wrong? Is the error subtle? Or, have I got the entirely wrong idea? I've begun to wonder: do I really need to free these primitives? If they were not primitives I would need to, and in that case, how would I?
have I got the entirely wrong idea?
I'm afraid so.
If you make one new[] call to allocate an array, then you must make one delete[] call to free it:
int *p = new int[10];
...
delete[] p;
If your array is in a struct, and you make one call to allocate the struct, then you must make one call to free it:
struct Foo {
int data[10];
};
Foo *foo = new Foo;
...
delete foo;
There is no way to free part of an array.
An int[10] array actually is 10 integers, in a row (that is, 40 bytes of memory on a 32 bit system, perhaps plus overhead). The integer values which are stored in the array occupy that memory - they are not themselves memory allocations, and they do not need to be freed.
All that said, if you want a variable length array:
that's what std::vector is for
#include <vector>
#include <iostream>
struct Foo {
std::vector<int> vec;
};
int main() {
Foo foo;
// no need for a separate length: the current length of the vector is
std::cout << foo.vec.size() << "\n";
// change the size of the vector to 10 (fills with 0)
foo.vec.resize(10);
// change the size of the vector to 7, discarding the last 3 elements
foo.vec.resize(7);
}
If p->elems is a pointer, then so is p->elems + i (assuming the operation is defined, i.e. i is of integral type) - and p->elems + i == &p->elems[i]
Anyhow, you most likely don't want to (and cannot) delete ints from an array of int (be it dynamically or automatically allocated). That is
int* ptr = new int[10];
delete &ptr[5]; // WRONG!
That is simply something you cannot do. However, if the struct contains the length of the array, you could consider the array "resized" after you change the length information contained by the struct - after all, there is no way to tell the size of the array a pointer points to.
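A minimal sketch of "resizing" by updating only the stored length, assuming a struct shaped like the one described in the question. IntArray and trim_trailing_zeros are hypothetical names; no element is deleted individually.

```cpp
#include <cstddef>

// Assumed layout: a heap-allocated array plus its logical length.
struct IntArray {
    int *elems;
    std::size_t length;
};

// Drop trailing zeros by shrinking the logical length only.
// The memory block itself is unchanged and is still released later
// with a single delete[] on elems.
void trim_trailing_zeros(IntArray *p) {
    while (p->length > 0 && p->elems[p->length - 1] == 0) {
        --p->length;
    }
}
```

The trimmed ints still occupy memory after the call; they are simply outside the range the struct now considers valid.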
If, however your array is an array of pointers to integers (int*[]) and these pointers point to dynamically allocated memory, then yes, you could delete single items and you'd do it along the lines of your code (you are showing so little code it's difficult to be exact).