Dynamic, indexed array of structs in C++

How do I create a dynamic array of structs in C++ with keys that are not contiguous?
I have a struct like this:
struct Data {
    int name;
    int val;
};
I'm reading names from a file of variable length.
Let's say it is:
1 2
3 5
7 12
In the end I'd like to have a (dynamic) array indexed by name,
like this (pseudo code):
arr[1] = *Data(1,2)
arr[3] = *Data(3,5)
arr[7] = *Data(7,12)
Expected result:
cout << arr[7]->val // outputs 12
cout << arr[3]->val // outputs 5
// array size = 3
How can I write this in C++?
(Assume only basic features: no vectors, maps, etc.)
So far I've tried something like this:
Data *distances = new Data[15]; // explicit size works
// Data *distances = new Data; // this doesn't
distances[5] = myDataStruct;

This allocates memory for exactly one Data element.
Data *distances = new Data;
This allocates memory for 15 Data elements.
Data *distances = new Data[15];
To get what you want, you could do one of the following:
allocate a maximum number of elements and always stay within that bound, or
allocate some number of elements and resize the array for each new element (either C style with malloc/realloc/free, or C++ style with new/delete, in which case resizing means allocating a new array and copying the old values over), or
provide a class that internally stores a list of all elements and exposes an operator[] that does not do an array lookup but searches the list.

Naïve implementation (this doesn't support adding new elements into the array, only lookup from an existing one. I'll leave a complete implementation as an exercise to the reader):
struct associative {
Data* data_arr;
Data* end;
Data* operator[](int i) {
return linear_search(data_arr, end, i);
}
};
You could use std::find (with custom predicate) in place of linear_search but if that isn't included in "basic features" by your definition, then you'll just have to re-implement it. Linear search is not difficult to implement.
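For illustration, a hand-written linear_search that fits the struct above could look like the sketch below. The answer only names the helper, so its signature, the at() convenience member and the "return nullptr when not found" convention are assumptions made here, not part of the original answer.

```cpp
#include <cassert>

struct Data {
    int name;
    int val;
};

// Hypothetical helper: returns a pointer to the first element in [first, last)
// whose name matches, or nullptr if no element matches.
Data* linear_search(Data* first, Data* last, int name) {
    for (Data* p = first; p != last; ++p) {
        if (p->name == name) return p;
    }
    return nullptr;
}

struct associative {
    Data* data_arr;  // first element (not owned by this struct)
    Data* end;       // one past the last element

    // Lookup by name instead of by position; O(n) in the number of elements.
    Data* operator[](int name) { return linear_search(data_arr, end, name); }

    // Assumed convenience accessor: the name must be present.
    Data& at(int name) {
        Data* p = (*this)[name];
        assert(p != nullptr);
        return *p;
    }
};
```

With the three records from the question stored in a plain array, arr[7]->val yields 12 and arr[3]->val yields 5, while a lookup for a name that was never stored returns a null pointer that the caller has to check.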
A few ways this naïve data structure can be improved:
RAII ownership over the pointed memory would make the data structure much safer to use.
Support for adding new elements into the internal array
A better data-structure instead of an array: Binary search tree, or a hash map would allow asymptotically faster lookup.
You are now re-implementing std::map or the unordered variant. Consider not reinventing the wheel.

Related

Which container should I use for random access, cheap addition and removal (without de/allocation), with a known maximum size?

I need the lightest container that can store up to 128 unsigned ints.
It must add, edit and remove elements with fast access to each one, without allocating new memory every time (I already know there will be at most 128).
Such as:
add int 40 at index 4 (1/128 item used)
add int 36 at index 90 (2/128 item used)
edit to value 42 the element at index 4
add int 36 at index 54 (3/128 item used)
remove element with index 90 (2/128 item used)
remove element with index 4 (1/128 item used)
... and so on. That way, I can always iterate through only the elements actually added to the container, instead of walking all 128 slots and checking each for NULL.
As I said, this process must not allocate or reallocate memory: the app manages "audio" data, and that means a glitch every time I touch the memory.
Which container would be the right candidate?
It sounds like an "indexes" queue.
As I understand the question, you have two operations
Insert/replace element value at cell index
Delete element at cell index
and one predicate
Is cell index currently occupied?
This is an array and a bitmap. When you insert/replace, you stick the value in the array cell and set the bitmap bit. When you delete, you clear the bitmap bit. When you ask, you query the bitmap bit.
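A minimal sketch of that array-plus-bitmap idea follows, assuming a fixed capacity of 128 as in the question. The class name SparseSlots and its member names are made up for illustration; nothing here allocates after construction.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Illustrative only: a fixed array of values plus one "occupied" bit per cell.
class SparseSlots {
    static constexpr std::size_t kSize = 128;
    unsigned values_[kSize] = {};
    std::bitset<kSize> used_;

public:
    // Insert or replace: store the value and set the bit.
    void set(std::size_t i, unsigned v) {
        assert(i < kSize);
        values_[i] = v;
        used_.set(i);
    }
    // Delete: only the bit needs clearing.
    void erase(std::size_t i) {
        assert(i < kSize);
        used_.reset(i);
    }
    // Predicate: is cell i currently occupied?
    bool occupied(std::size_t i) const {
        assert(i < kSize);
        return used_.test(i);
    }
    unsigned get(std::size_t i) const {
        assert(occupied(i));
        return values_[i];
    }
    // Number of occupied cells.
    std::size_t count() const { return used_.count(); }
};
```

Replaying the sequence from the question (add at 4, add at 90, edit 4, add at 54, remove 90) leaves two occupied cells, with cell 4 holding 42.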
You can just use std::vector<int> and do vector.reserve(128); to keep the vector from allocating memory. This doesn't allow you to keep track of particular indices though.
If you need to keep track of an 'index' you could use std::vector<std::pair<int, int>>. This doesn't allow random access though.
If you only need cheap setting and erasing values, just use an array. You can keep track of what cells are used by marking them in another array (or bitmap). Or by just defining one value (e.g. 0 or -1) as an "unused" value.
Of course, if you need to iterate over all used cells, you need to scan the whole array. But that's a tradeoff you need to make: either do more work during adding and erasing, or do more work during a search. (Note that an .insert() in the middle of a vector<> will move data around.)
In any case, 128 elements is so few, that a scan through the whole array will be negligible work. And frankly, I think anything more complex than a vector will be total overkill. :)
Roughly:
unsigned data[128] = {0}; // initialize
unsigned used[128] = {0};
data[index] = newvalue; used[index] = 1; // set value
data[index] = used[index] = 0; // unset value
// check if a cell is used and do something
if (used[index]) { do something } else { do something else }
I'd suggest a tandem of vectors, one to hold the active indices, the other to hold the data:
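#include <algorithm>
#include <cstddef>
#include <vector>

class Container
{
    std::vector<std::size_t> indices; // sorted world indices currently in use
    std::vector<int> data;            // data[k] belongs to indices[k]
    // Position of worldIndex in the sorted indices (or where it would be inserted).
    std::size_t index_worldToData(std::size_t worldIndex) const
    {
        auto it = std::lower_bound(indices.begin(), indices.end(), worldIndex);
        return it - indices.begin();
    }
public:
    Container()
    {
        indices.reserve(128);
        data.reserve(128);
    }
    // Precondition: worldIndex is present.
    int& operator[] (std::size_t worldIndex)
    {
        return data[index_worldToData(worldIndex)];
    }
    // Precondition: worldIndex is not present yet.
    void addElement(std::size_t worldIndex, int element)
    {
        auto dataIndex = index_worldToData(worldIndex);
        indices.insert(indices.begin() + dataIndex, worldIndex);
        data.insert(data.begin() + dataIndex, element);
    }
    // Precondition: worldIndex is present.
    void removeElement(std::size_t worldIndex)
    {
        auto dataIndex = index_worldToData(worldIndex);
        indices.erase(indices.begin() + dataIndex);
        data.erase(data.begin() + dataIndex);
    }
    class iterator
    {
        Container *cnt;
        std::size_t dataIndex;
    public:
        iterator(Container *c, std::size_t i) : cnt(c), dataIndex(i) {}
        int& operator* () const { return cnt->data[dataIndex]; }
        iterator& operator++ () { ++dataIndex; return *this; }
        bool operator!= (const iterator& other) const { return dataIndex != other.dataIndex; }
    };
    iterator begin() { return iterator{ this, 0 }; }
    iterator end() { return iterator{ this, indices.size() }; }
};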
(Disclaimer: code not touched by compiler, preconditions checks omitted)
This one has logarithmic time element access, linear time insertion and removal, and allows iterating over non-empty elements.
You could use a doubly-linked list and an array of node pointers.
Preallocate 128 list nodes and keep them on freelist.
Create an empty itemlist.
Allocate an array of 128 node pointers called items.
To insert at i: Pop the head node from freelist, add it to itemlist, set items[i] to point at it.
To access/change a value, use items[i]->value.
To delete at i, remove the node pointed to by items[i] and reinsert it in freelist.
To iterate, just walk itemlist.
Everything is O(1) except iteration, which is O(Nactive_items). Only caveat is that iteration is not in index order.
Freelist can be singly-linked, or even an array of nodes, as all you need is pop and push.
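The steps above could be sketched roughly as follows. This is one possible reading of the answer, not its author's code: the names SlotList, Node, set, erase and for_each are invented for the example, the freelist is singly linked through the next pointer, and each node also remembers its slot index so iteration can report it.

```cpp
#include <cassert>
#include <cstddef>

class SlotList {
    static constexpr std::size_t kSize = 128;
    struct Node {
        unsigned value = 0;
        std::size_t index = 0;  // slot this node currently serves
        Node* prev = nullptr;
        Node* next = nullptr;
    };
    Node pool_[kSize];          // all nodes are preallocated here
    Node* free_ = nullptr;      // freelist, singly linked through next
    Node* head_ = nullptr;      // itemlist: doubly linked list of active nodes
    Node* items_[kSize] = {};   // items_[i] is null while slot i is empty

public:
    SlotList() {
        for (std::size_t i = 0; i < kSize; ++i) {
            pool_[i].next = free_;
            free_ = &pool_[i];
        }
    }
    // Nodes point into this object, so copying is disabled.
    SlotList(const SlotList&) = delete;
    SlotList& operator=(const SlotList&) = delete;

    bool contains(std::size_t i) const {
        assert(i < kSize);
        return items_[i] != nullptr;
    }
    // Insert or overwrite the value at slot i in O(1).
    void set(std::size_t i, unsigned v) {
        assert(i < kSize);
        if (Node* existing = items_[i]) {
            existing->value = v;
            return;
        }
        Node* n = free_;  // never null here: there are as many nodes as slots
        free_ = n->next;
        n->value = v;
        n->index = i;
        n->prev = nullptr;
        n->next = head_;
        if (head_) head_->prev = n;
        head_ = n;
        items_[i] = n;
    }
    unsigned get(std::size_t i) const {
        assert(contains(i));
        return items_[i]->value;
    }
    // Unlink the node for slot i and push it back onto the freelist, O(1).
    void erase(std::size_t i) {
        assert(i < kSize);
        Node* n = items_[i];
        if (!n) return;
        if (n->prev) n->prev->next = n->next; else head_ = n->next;
        if (n->next) n->next->prev = n->prev;
        n->next = free_;
        free_ = n;
        items_[i] = nullptr;
    }
    // Visit only the active items; order is most recently inserted first.
    template <class F>
    void for_each(F f) const {
        for (const Node* n = head_; n; n = n->next) f(n->index, n->value);
    }
};
```

Iteration touches only the active nodes, which is the property the question asks for; the price is the extra pointers per node and the loss of index order.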
class Container {
private:
std::set<size_t> indices;
unsigned int buffer[128];
public:
void set_elem(const size_t index, const unsigned int element) {
buffer[index] = element;
indices.insert(index);
}
// and so on -- iterate over the indices if necessary
};
There are multiple approaches you can use; I will list them in order of increasing effort.
The most affordable solution is to use the Boost non-standard containers, of particular interest is flat_map. Essentially, a flat_map offers the interface of a map over the storage provided by a dynamic array.
You can call its reserve member at the start to avoid memory allocation afterward.
A slightly more involved solution is to code your own memory allocator.
The interface of an allocator is relatively easy to deal with, so that coding an allocator is quite simple. Create a pool-allocator which will never release any element, warm it up (allocate 128 elements) and you are ready to go: it can be plugged in any collection to make it memory-allocation-free.
Of particular interest, here, is of course std::map.
Finally, there is the do-it-yourself road. Much more involved, quite obviously: the number of operations supported by standard containers is just... huge.
Still, if you have the time or can live with only a subset of those operations, then this road has one undeniable advantage: you can tailor the container specifically to your needs.
Of particular interest here is the idea of having a std::vector<boost::optional<int>> of 128 elements... except that since this representation is quite space inefficient, we use the Data-Oriented Design to instead make it two vectors: std::vector<int> and std::vector<bool>, which is much more compact, or even...
struct Container {
    static size_t const Size = 128;
    int array[Size];
    std::bitset<Size> marker;
};
which is both compact and allocation-free.
Now, iterating requires iterating the bitset for present elements, which might seem wasteful at first, but said bitset is only 16 bytes long so it's a breeze! (because at such scale memory locality trumps big-O complexity)
Why not use std::map<int, int>, it provides random access and is sparse.
If a vector (pre-reserved) is not handy enough, look into Boost.Container for various “flat” varieties of indexed collections. This will store everything in a vector and not need memory manipulation, but adds a layer on top to make it a set or map, indexable by which elements are present and able to tell which are not.

Expanding size of a list

I am writing a class that holds strings. It is basically a container that will hold strings. I am wondering how to expand the size of the container as the container grows larger.
Right now I have an array holding the strings and the size of the array is set at 10. I've thought about creating a two dimensional array but since the size would be arbitrarily assigned anyway, do not think that would make any difference.
class stringlist {
public:
typedef std::string str;
void push(str);
void pop();
void print();
private:
str container[10];
};
void stringlist::push(str s)
{
size_t sz = sizeof(container) / sizeof(*container);
str* ptr = container;
while(ptr[sz] != "" && *ptr != "")
++ptr;
*ptr = s;
}
void stringlist::pop()
{
size_t sz = sizeof(container) / sizeof(*container);
str* ptr = container;
while(ptr != ptr + sz)
++ptr;
*ptr = "";
}
void stringlist::print()
{
size_t sz = sizeof(container) / sizeof(*container);
str* ptr = container;
while(ptr[sz] != "" && *ptr != "")
std::cout << *ptr++ << " ";
std::cout << std::endl;
}
EDIT
Basically I am looking for some kind of dynamic memory allocation. str* container = new str[N] where N can be specified. But I am not sure how to implement without knowing N beforehand.
If I use constructors I get an error:
public:
stringlist() : N(15) {}
stringlist(size_t sz) : N(sz) {}
private:
str* container = new str[N];
size_t N;
ERROR
a.out(29866,0x7fff76388310) malloc: *** mach_vm_map(size=3377629375143936) failed (error code=3)
*** error: can't allocate region
*** set a breakpoint in malloc_error_break to debug
libc++abi.dylib: terminating with uncaught exception of type std::bad_alloc: std::bad_alloc
EDIT
I actually got it to work with the constructor method. I had the value of N being set after allocating the container, which gave me the error. I switched the order and it works.
NOTE: This does not yet solve the problem of growing the list as I add elements.
_____________________________________________________________________________________
REFLECTION
It seems that if I change the value of the data member N, the size of the array reallocates to that size. I thought that since the array was created when the object was created the array won't change sizes but after writing and running some functions it is doing exactly that.
str container[10]; is a fixed size array of ten strings. This means your class will have 10 strings in it even if you haven't added any yet.
You might want to consider using a std::vector to store your strings in, inside your class. This will allow you to grow and shrink the container as you see fit.
However, if you're going to use an STL container inside your class, I don't see why you wouldn't just use a std::list to start with.
If you only need a stack-like container (where the only useful operations are pushing, popping and possibly iterating) you might want to use a linked list instead of an indexed one.
More on linked lists here
Otherwise you'll have to expand the list and copy all objects from the old list to the new one whenever you "grow out" of the current list.
Arrays are static data structures, meaning they have a fixed size. You can either allocate memory dynamically by creating and using the appropriate data structure for your purpose, OR (recommended) you can use the STL's generic classes, such as vector, list, map, etc.
It seems that if I change the value of the data member N, the size of the array reallocates to that size. I thought that since the array was created when the object was created the array won't change sizes but after writing and running some functions it is doing exactly that.
Unless I'm misunderstanding you, it sounds like you mean that you get the desired behavior simply by increasing N.
If you don't reallocate the array to the new size you'll end up with undefined behavior, sometimes working and sometimes crashing. Should another object be assigned the space right behind your array it'll be overwritten if you keep adding elements to your list.
Your best bet, like what JBarberU said above, is a linked list. However, if you would like it to be indexed, you could create a dynamic array class that has a size and a capacity. Every time you add a string to the array, increment the size; if adding would exceed the capacity, first create a new array twice as large and copy all of the values over.
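A rough sketch of such a size-and-capacity array, reusing the question's stringlist name, might look like this. The member names, the default capacity of 10 and the choice to disable copying are assumptions for the example; in real code std::vector<std::string> already does all of this.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

class stringlist {
public:
    // The initializers use the constructor parameter, not another data member,
    // so they do not depend on the order in which the members are declared.
    explicit stringlist(std::size_t capacity = 10)
        : container_(new std::string[capacity]), size_(0), capacity_(capacity) {}
    ~stringlist() { delete[] container_; }
    stringlist(const stringlist&) = delete;
    stringlist& operator=(const stringlist&) = delete;

    void push(const std::string& s) {
        if (size_ == capacity_) grow();  // full: make room first
        container_[size_++] = s;
    }
    void pop() {
        assert(size_ > 0);
        container_[--size_].clear();
    }
    const std::string& operator[](std::size_t i) const {
        assert(i < size_);
        return container_[i];
    }
    std::size_t size() const { return size_; }
    std::size_t capacity() const { return capacity_; }

private:
    // Allocate an array twice as large, move the old strings over, free the old one.
    void grow() {
        std::size_t newCapacity = capacity_ == 0 ? 1 : capacity_ * 2;
        std::string* bigger = new std::string[newCapacity];
        for (std::size_t i = 0; i < size_; ++i) bigger[i] = std::move(container_[i]);
        delete[] container_;
        container_ = bigger;
        capacity_ = newCapacity;
    }
    std::string* container_;
    std::size_t size_;
    std::size_t capacity_;
};
```

Doubling keeps the number of reallocations logarithmic in the number of pushes, so the copying cost averages out to a constant per push.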

Difference between linked lists and array of structs?

what's the difference between those pieces of code?
1)
struct MyStruct
{
int num;
} ms[2];
ms[0].num = 5;
ms[1].num = 15;
2)
struct MyStruct
{
int num;
MyStruct *next;
};
MyStruct *ms = new MyStruct;
ms->num = 5;
ms->next = new MyStruct;
ms->next->num = 15;
I'm probably a little confused about linked lists and lists in general. Are they useful for something in particular? Please explain a bit more.
Your first definition...
struct MyStruct
{
int num;
} ms[1];
...creates a statically allocated array with a single element. You cannot change the size of the array while your program is running; this array will never hold more than one element. You can access items in the array by direct indexing; e.g., ms[5] would get you the sixth element in the array (remember, C and C++ arrays are 0-indexed, so the first element is ms[0]), assuming that you had defined an array of the appropriate size.
Your second definition...
struct MyStruct
{
int num;
MyStruct *next;
};
...creates a dynamically allocated linked list. Memory for this list is allocated dynamically during runtime, and the linked list can grow (or shrink) during the lifetime of the program. Unlike arrays, you cannot directly access any element in the list; to get to the sixth element you have to start at the first element and then iterate 5 times.
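To make the "start at the first element and iterate" point concrete, here is a small sketch. The helper names nth, at, push_front and destroy are hypothetical; only MyStruct comes from the question.

```cpp
#include <cassert>
#include <cstddef>

struct MyStruct {
    int num;
    MyStruct* next;
};

// Follow n links from head; returns nullptr if the list has n or fewer elements.
MyStruct* nth(MyStruct* head, std::size_t n) {
    while (head != nullptr && n > 0) {
        head = head->next;
        --n;
    }
    return head;
}

// Like nth, but the element must exist.
int& at(MyStruct* head, std::size_t n) {
    MyStruct* p = nth(head, n);
    assert(p != nullptr);
    return p->num;
}

// Growing the list is O(1): allocate one node and link it in front.
MyStruct* push_front(MyStruct* head, int num) {
    return new MyStruct{num, head};
}

// Free every node of the list.
void destroy(MyStruct* head) {
    while (head != nullptr) {
        MyStruct* next = head->next;
        delete head;
        head = next;
    }
}
```

Reaching element n costs n pointer hops, whereas an array computes the address of ms[n] directly; in exchange, the list can grow one node at a time without copying anything.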
Regarding the errors in your code: the first snippet constructs a fixed number of MyStruct elements and stores them in the ms array, so ms is an array of MyStruct structures. You presumably meant it to have 2 elements. Later on you can't add any other element to the ms array, because its size is fixed. In the second case, with a linked list, you can chain as many MyStruct elements as you want at run time, which gives you a dynamic number of elements. Conceptually, the second case should look like this in memory:
[ MyStruct#1 ] ----> [ MyStruct#2 ] ----> [ NULL ]
NULL though could be a MyStruct#3 for example, while the first one:
[ MyStruct#1 ] ----> [ MyStruct#2 ]
and that's it, no MyStruct#3 can be added.
Now let's go through the code you wrote:
struct MyStruct
{
int num;
} ms[1];
ms[1] really means "create an ms array of one MyStruct element".
The code that follows assumes you created two:
ms[0].num = 5;
ms[1].num = 15;
Hence it should have been:
struct MyStruct
{
int num;
} ms[2];
And it will work fine! and keep in mind the simple illustration I made for it:
[ MyStruct#1 ] ----> [ MyStruct#2 ]
Second Case:
struct MyStruct
{
int num;
MyStruct *next;
};
MyStruct *ms = new MyStruct;
ms->num = 5;
ms->next = new MyStruct;
ms->next->num = 15;
This code uses the C++ operator new. If you save your source code as .cpp, it will compile as a C++ application with no errors. For C, the syntax should change like so:
struct MyStruct
{
    int num;
    struct MyStruct *next;
};
struct MyStruct *ms = malloc(sizeof(struct MyStruct));
ms->num = 5;
ms->next = malloc(sizeof(struct MyStruct));
ms->next->num = 15;
ms->next->next = NULL; /* terminate the list */
Don't forget #include <stdlib.h> for the malloc() function; you can read more about this function here.
And as the first case recall my illustration for the linked-list:
[ MyStruct#1 ] ----> [ MyStruct#2 ] ----> [ NULL ]
Here NULL is the next member of the MyStruct that ms->next points to. To explain further: ms->next is a pointer to MyStruct, and we allocated space for it on the heap, so it now points to a block of memory the size of a MyStruct structure. Note that neither new nor malloc sets that next member for you; it has to be assigned NULL explicitly.
Finally here is a Stackoverflow question about when to use a linked-list and when to use an array so you can get exactly why people all around the world prefer linked-list sometimes and array other times.
Oh, my friend, there are dozens of different kinds of data structures that pretty much just hold a bunch of num values or whatever. The reason programmers don't just use arrays for everything is the differences in the amount of memory required, and the ease of doing whichever operations are most important for your particular needs.
Linked lists happen to be very quick at adding or removing individual items. The trade off is that finding an item in the middle of the list is relatively slow, and the extra memory required by the next pointers. A properly-sized array is very compact in memory, and you can access an item in the middle very quickly, but to add a new item at the end you either have to know the maximum number of elements beforehand, which is often impossible or wastes memory, or reallocate a larger array and copy everything over, which is slow.
Therefore, someone who doesn't know how big their list needs to be, and who mostly only needs to deal with items at the beginning or end of the list or always loops over the entire list, and cares more about execution speed than saving a few bytes of memory, is very likely to choose a linked list over an array.
The main differences between lists and arrays in general:
Ordering in lists is explicit; each element stores the location of the preceding/succeeding element. Ordering in arrays is implicit; each element is assumed to have a preceding/succeeding element. Note that a single list may contain multiple orderings. For example, you could have something like
struct dualList {
T data1;
K data2;
struct dualList *nextT;
struct dualList *nextK;
};
that allows you to order the same list two different ways, one by data1 and the other by data2.
Adjacent array elements are in adjacent memory locations; adjacent list elements don't have to be in adjacent locations.
Arrays offer random access to their elements; lists only offer sequential access (i.e., you have to walk down the list to find an element).
Arrays are (usually) fixed in length [1] - adding elements to or removing elements from the array doesn't change the array's size. Lists can grow or shrink as needed.
Lists are great for maintaining a dynamically changing sequence of values, especially if the values need to remain ordered. They're not so hot for storing relatively static data that needs to be retrieved quickly and frequently, since you can't access elements randomly.
[1] You can get around this by allocating memory dynamically and then using realloc to resize that memory block as needed, but it needs to be done carefully and can be a bit of a PITA.
Linked lists are useful when element ordering is important, and the number of elements is not known in advance. Besides, accessing an element in linked list takes O(n) time. When you look for an element in a list, in the worst case, you'll have to look at every element of a list.
For array, the number must be known in advance. When you define an array in C, you have to pass it its size. On the other hand, accessing an array element takes O(1) time, since an element can be addressed by index. With linked list, that is not possible.
However, that is not C++ related question, since the concept of linked list and array is not tied to C++.
An array is a contiguous pre-allocated block of memory whereas a linked list is a collection of runtime allocated ( malloc ) pieces of memory ( not necessarily contiguous ) linked to each other via pointers ( *next ). You would generally use an array of structs if you know at compile time the maximum number of elements you need to store. A linked list of structs however is useful if you don't know the maximum number of elements that will need to be stored. Also, with a linked list the number of elements may change as elements are added and removed.

Adding element to Array of Objects in C++

How do I add an element to the end of an array dynamically in C++?
I'm accustomed to using vectors to dynamically add an element. However, vectors do not seem to want to handle an array of objects.
So, my main goal is having an array of objects and then being able to add an element to the end of the array to take another object.
EDIT**
Sorry, it's the push_back() that causes me the problems.
class classex
{
private:
int i;
public:
classex() { }
void exmethod()
{
cin >> i;
}
};
void main()
{
vector <classex> vectorarray;
cout << vectorarray.size();
cout << vectorarray.push_back();
}
Now I know push_back must have an argument, but What argument?
Now I know push_back must have an argument, but What argument?
The argument is the thing that you want to append to the vector. What could be simpler or more expected?
BTW, you really, really, really do not want exmethod as an actual method of classex in 99% of cases. That's not how classes work. Gathering the information to create an instance is not part of the class's job. The class just creates the instance from that information.
Arrays are fixed-size containers, so enlarging them is not possible. You can work around this by copying the array into a bigger one, which gains space after the old end, but that's it.
You can create an array larger than you currently need and remember which elements are empty. Of course they are never really empty (they at least contain 0's), but that's a different story.
Besides arrays there are many other containers, and some are able to grow, like the STL containers: lists, vectors, deques, sets and so on.
Add a constructor that sets i (just to give your example a real-world touch) to your example classex, like this:
class classex {
public:
classex(int v) : i(v) {} // by value, so that classex(1) compiles
private:
int i;
};
An example for a growing container looks like this:
vector <classex> c; // c for container
// c is empty now. c.size() == 0
c.push_back(classex(1));
c.push_back(classex(2));
c.push_back(classex(3));
// c.size() == 3
EDIT: The question was how to add an element to a dynamically allocated array, but the OP actually meant std::vector. Below the separator is my original answer.
std::vector<int> v;
v.push_back( 5 ); // 5 is added to the back of v.
You could always use C's realloc and free. EDIT: (Assuming your objects are PODs.)
When compared to the requirement of manually allocating, copying, and reallocating using new and delete, it's a wonder Stroustrup didn't add a keyword like renew.

Does the array key determine array size in C++?

I'm storing some settings for objects in an array. The IDs of the objects are used as the key. The IDs start from 100000 and go up. If I were to input data for an object with ID 100000, would C++ automatically create 99999 blank key entries starting from 0?
Array size is determined when you create an array.
To access object at index 100 000 you need to have array of at least that size, which answers your question.
If the array is smaller, you will access memory at
array begin address + (index * object size)
which is not a good thing. E.g. the following reads memory outside of your array. This is undefined behavior: it may print whatever happens to be stored at that point in memory, or it may crash:
string arr[3];
cout << arr[5] << endl;
Assuming you are talking about standard array like:
string arr[10];
An array's size is fixed at compile time. For example, you can't do:
string arr[]; // this will fail to compile, no size specified
But you can do:
string arr[] = {"1","2","3"}; // array size is 3
string arr1[3]; // array size is 3
string arr2[3] = {"1"}; // array size is 3
If you want to map extra parameters for object you are better off using std::map like:
class person {};
std::map<person*,int> PersonScore;
This assumes that the additional parameters are not logically part of the object otherwise you would just add them to the object.
Maybe you want something along the lines of:
class ArrayPlus100k {
    Item underlyingArray[NUM_ELEMENTS];
public:
    Item& operator [] (int i) { return underlyingArray[i-100000]; }
    // etc.
};
If you truly mean an array, and by key you mean index, then subtracting 100,000 from your index will provide you with a zero-based array index. There will be no unused entries.
There may be a better container than a flat array. Choosing the right data structure depends on what you are trying to do. If you are storing objects using a key, you might want to use a std::map<key, value>.
What happens depends entirely on the data structure you choose to use. If you use a map, only the items you insert will take up space in memory. If you use new to allocate an actual array, then you will want to allocate only enough space for the items you want to store. In that case, adjust your index by subtracting 100,000.
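Both options can be sketched briefly. The constants kFirstId and kCount and the function names are placeholders chosen for the example; the actual first ID and number of objects depend on your program.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Option 1: a map stores only the keys that were actually inserted.
std::size_t entries_after_two_inserts() {
    std::map<int, int> settings;
    settings[100000] = 7;
    settings[100005] = 9;
    return settings.size();  // two entries; nothing is created for 0..99999
}

// Option 2: a plain array indexed by (id - first id).
constexpr int kFirstId = 100000;     // assumed lowest object id
constexpr std::size_t kCount = 16;   // assumed number of objects

std::size_t slot_for(int id) {
    assert(id >= kFirstId);
    std::size_t slot = static_cast<std::size_t>(id - kFirstId);
    assert(slot < kCount);  // the array must be sized for the highest id
    return slot;
}
```

With the offset approach, an array declared as int settings[kCount] is indexed as settings[slot_for(id)], so ID 100000 lands in element 0.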
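No, C++ will not create entries 0-99999 for you, and an array cannot start at index 100000 either: indices always run from 0 to the array size minus one.
For example, if you declare the following:
int arr[5];
the valid elements are arr[0] through arr[4]; accessing arr[7] is out of bounds.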
I hope you understand...