Array vs Vector for runtime of application - c++

I am trying to decide whether I should be using an array or a vector for my case, which I will describe below:
Consider this scenario: you need to store three different pieces of data per object: an integer (from 1-4, representing an output), a boolean (whether the output is true or false), and a second integer (from 0-3, representing the state). The maximum number of any of these values that can be created is 72.
The way I have achieved this is by creating three separate arrays that get filled when an object containing the above information is created. Likewise, the object-specific information is deleted when the object is destroyed. This information is needed from the moment the application starts until it is closed (assuming the object is not destroyed).
Since it is rare for 72 of these objects to be created in my application (that use case is highly unlikely), I'm not sure whether it is smart (as far as memory is concerned) to initialize these global arrays to 72 elements from the start, or to use a vector and let it grow as the objects are created.
So, my question is: given the above scenario, is it better to use arrays or vectors?
Note: one thing to keep in mind is that the index into my arrays represents the order in which the objects were created, which makes it an easy way to keep track of my elements. I.e., the information at index 0 of any of the three arrays belongs to object 1, etc. Would I be able to keep this same indexing scheme with vectors?

I would use std::vector because it keeps track of how many items it contains. I would also pre-allocate enough space for all 72 items using the reserve() method, so that at most one allocation ever happens.
I would also make a struct holding your three values and keep a single vector of those structs.
#include <vector>

struct item
{
    int output; // could use std::int8_t to reduce memory
    bool valid;
    int state;  // same here
    // (or even use bitfields for all 3 values)
};

// ...

int main()
{
    std::vector<item> items;
    items.reserve(72); // preallocate enough for all items if you like
    // ... etc...
}
If you are really worried about memory you can use bitfields to cram your struct into a single byte:
struct item
{
    unsigned char output : 2; // 0-3 (need to add 1 to get 1-4)
    unsigned char valid  : 1;
    unsigned char state  : 2; // 0-3
};
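As a quick sanity check, here is a minimal sketch; bit-field packing is implementation-defined, but on common compilers this struct occupies a single byte:
#include <cstdio>

struct item
{
    unsigned char output : 2;
    unsigned char valid  : 1;
    unsigned char state  : 2;
};

int main()
{
    // Usually prints 1; the exact packing is up to the implementation.
    std::printf("sizeof(item) = %zu\n", sizeof(item));
}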

Both options are valid.
The only thing with arrays is that you need to track yourself how far the array is filled with valid entries, whereas you can simply ask the vector for its size.
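A minimal sketch of that difference (the names are illustrative, not from the question):
#include <vector>

struct item { int output; bool valid; int state; };

// Fixed array: you must carry the element count yourself.
item items_arr[72];
int  items_count = 0;          // updated manually on every add/remove

// Vector: the container tracks its own size.
std::vector<item> items_vec;   // items_vec.size() is always correct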

Find End of Array Declared as Struct Type C++

I was recently learning to use the struct data type in C++. I know the basics of how structs work and how to manipulate their members. But I was wondering how I would determine the end of an array of structs. For example, consider the code below:
struct PersonDetails
{
    string name, address;
    int age, number;
};
Now in my C++ program I create an array of that struct type as follows:
PersonDetails Data[500];
Now consider that I have 30 records in the Data array and I have to display those records by looping over the array by index. How would I determine that I only have to loop over the first 30 indexes, since data is only stored in those? In a char array we compare elements with '\0' to find the end of the string; what method should we use for the Data[] array?
Edit: I have no idea about vectors, and the project I am working on requires me to use only the basics of C++ (functions, control structures, loops, etc.).
It's not feasible.
For char[], back in the days of C standardization, developers agreed to use \0 (integer value 0) as a special character marking the end of a string. Everything works as long as everyone follows this convention (i.e. both the standard library functions and the developers using them).
If you wanted such a convention for your type, you could just write down "a PersonDetails object with both strings empty and both ints equal to 0 is the array terminator", but you would have to follow this convention. You'd have to write functions that stop processing the array upon finding such an object. You'd have to make sure that every array contains at least one such object.
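A minimal sketch of what following such a convention could look like, reusing the PersonDetails struct from the question (with the ints default-initialized to 0 here so that an untouched record really is a terminator):
#include <cstddef>
#include <iostream>
#include <string>
using namespace std;

struct PersonDetails
{
    string name, address;
    int age = 0, number = 0;
};

// Terminator convention: both strings empty and both ints equal to 0.
bool isTerminator(const PersonDetails& p)
{
    return p.name.empty() && p.address.empty() && p.age == 0 && p.number == 0;
}

void printAll(const PersonDetails* data)
{
    // Relies entirely on every array ending with at least one terminator record.
    for (size_t i = 0; !isTerminator(data[i]); ++i)
        cout << data[i].name << '\n';
}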
Instead
you should use std::vector<PersonDetails>, which can automatically accommodate any number of objects and will know precisely how many of them are currently stored (via its size() method),
or
use std::array<PersonDetails, 30>, which stores exactly 30 objects, all of which you can assume to be valid.
IMHO the correct way to solve this is to not use a C-style array, but instead a std::array or std::vector that knows its own .size().
Iterating a std::vector or std::array is trivial:
for (const auto& element : Data_array) {
    // Do something with the array element
}
See also:
https://en.cppreference.com/w/cpp/container/array
https://en.cppreference.com/w/cpp/container/vector
https://en.cppreference.com/w/cpp/language/for
https://en.cppreference.com/w/cpp/language/range-for
The simplest solution is to just have a separate variable specifying how many array elements are filled in.
PersonDetails Data[500];
int numPersons = 0;
Data[0].name = ... ;
Data[0].address = ...;
Data[0].age = ...;
Data[0].number = ...;
numPersons = 1;
Data[1].name = ... ;
Data[1].address = ...;
Data[1].age = ...;
Data[1].number = ...;
numPersons = 2;
...
Then you use that variable when looping through the array.
for (int i = 0; i < numPersons; ++i)
{
    // use Data[i] as needed...
}
I don't really agree that using std::array makes any difference here.
The problem is not whether such an element exists in the container, but whether the element we are inspecting is useful.
Consider the example you gave: for an array of chars, we simply check whether an element is \0 to decide whether or not we should halt the iteration.
How does that work? The remaining elements are, of course, default-initialized to \0; they exist, but are of no use.
Similarly, in this example you can check whether
name.empty()
or, to avoid any possible exception, as mentioned in the comment section, add a user-defined constructor to the class (or struct, they are the same thing really) which initializes age to -1, and then check for age == -1,
because it's impossible for a person to have no name; an empty name means you never assigned to that element and the remaining elements are unused, so you stop the iteration.
As a supplement, using std::vector makes sense, but if that isn't an option for you for the time being, you don't need to consider it.
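A minimal sketch of that idea, assuming the Data array and PersonDetails struct from the question (and that unfilled records have an empty name):
// Stop at the first record whose name is still empty, i.e. was never filled in.
for (int i = 0; i < 500 && !Data[i].name.empty(); ++i)
{
    cout << Data[i].name << " " << Data[i].age << '\n';
}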

Memory management when using vector

I am making a game engine and need to use the std::vector container for all of the components and entities in the game.
In a script the user might need to hold a pointer to an entity or component, perhaps to continuously check some kind of state. If something is added to the vector the pointer points into and the capacity is exceeded, it is my understanding that the vector will allocate new memory, and every pointer to any element in the vector will become invalid.
Considering this issue, I have a couple of possible solutions. After each push_back to the vector, would it be viable to check whether a stored capacity variable has been exceeded by the actual capacity of the vector, and if so, fetch the new pointers and overwrite the old ones? Would this be guaranteed to catch every case in which a push_back invalidates pointers?
Another solution I've found is to instead save an index to the element and access it that way, but I suspect that is bad for performance when you need to continuously check the state of that element (every 1/60th of a second).
I am aware that other containers do not have this issue, but I'd really like to make it work with a vector. Also, it might be worth noting that I do not know in advance how many entities/components there will be.
Any input is greatly appreciated.
You shouldn't worry about the performance of std::vector when you access an element only 60 times per second. By the way, in Release mode std::vector::operator[] typically compiles down to a single lea instruction; in Debug mode it may be decorated with some runtime range checks.
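A minimal sketch of the index-as-handle approach (the Entity type and names here are illustrative, not from the question):
#include <cstddef>
#include <vector>

struct Entity { int state = 0; /* ... */ };

std::vector<Entity> entities;

// Store an index instead of a pointer; it stays valid across reallocations
// as long as elements are only appended (never erased or reordered).
std::size_t player = 0;

void update()   // called every frame, e.g. 60 times per second
{
    if (!entities.empty() && entities[player].state == 1)
    {
        // react to the state change...
    }
}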
If the user is going to store pointers to the objects, why even contain them in a vector?
I don't feel it is a good idea to keep pointers to vector elements (i.e. to create pointers like my_ptr = &my_vec[n];). The whole point of a container is to reference its contents in the normal ways the container supports, not to hold outside pointers to its elements.
To answer your question about whether you can detect the allocations: yes, you could, but it is still probably a bad idea to reference the contents of a vector via pointers to its elements.
You could also reserve space in the vector when you create it, if you have some idea of what the maximum size might grow to. Then it would never reallocate (as long as that maximum is not exceeded).
edit:
After reading other responses, and thinking about what you asked, another thought occurred. If your vector is a vector of pointers to objects, and you pass out those pointers to your clients, resizing the vector does not invalidate the pointers that the vector holds. The issue becomes keeping track of the lifetime of the object (who owns it), which is why using shared_ptr would be useful.
For example:
std::vector<std::shared_ptr<Object>> my_vec; // Object: whatever type you are storing
my_vec.push_back(stuff);                     // stuff is a std::shared_ptr<Object>
if you pass out the pointers contained in the vector to clients...
client_ptr = my_vec[3];
There will be no problem when the vector resizes. The contents of the vector will be preserved, and whatever was at my_vec[3] will still be there. The object pointed to by my_vec[3] will still be at the same address, and my_vec[3] will still contain that address. Whoever got a copy of the pointer at my_vec[3] will still have a valid pointer.
However, if you did this:
client_ptr = &my_vec[3];
And the client is dereferencing like this:
(*client_ptr)->whatever();
You have a problem. Now when my_vec resizes, &my_vec[3] is probably no longer valid, and client_ptr points to freed memory.
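A small self-contained sketch of the difference, using a hypothetical Object type:
#include <memory>
#include <vector>

struct Object { int state = 0; };

int main()
{
    std::vector<std::shared_ptr<Object>> my_vec;
    my_vec.push_back(std::make_shared<Object>());

    // Safe: a copy of the shared_ptr keeps pointing at the same Object,
    // no matter how often the vector reallocates.
    std::shared_ptr<Object> client = my_vec[0];

    // Unsafe: a pointer to the vector element itself is invalidated
    // by any push_back that triggers a reallocation.
    std::shared_ptr<Object>* element_ptr = &my_vec[0];

    for (int i = 0; i < 1000; ++i)
        my_vec.push_back(std::make_shared<Object>());

    client->state = 1;              // still fine
    // (*element_ptr)->state = 1;   // would be undefined behaviour here
    (void)element_ptr;              // silence unused-variable warnings
}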
If something is added to the vector that the pointer points to and the capacity is exceeded, it is my understanding that the vector will allocate new memory and every pointer that points to any element in the vector will become invalid.
I once wrote some code to analyze what happens when a vector's capacity is exceeded. (Have you done this yet?) What that code demonstrated on my Ubuntu system with g++ v5 was that std::vector simply a) doubles the capacity, b) moves all the elements from the old storage to the new, then c) cleans up the old. Perhaps your implementation is similar. I think the details of capacity expansion are implementation dependent.
And yes, any pointer into the vector would be invalidated when push_back() causes capacity to be exceeded.
1) I simply don't use pointers-into-the-vector (and neither should you). In this way the issue is completely eliminated, as it simply cannot occur (see also: dangling pointers). The proper way to access a std::vector (or std::array) element is by index (via the operator[]() method).
After any capacity expansion, the indexes of all elements below the previous capacity limit are still valid, because push_back() installs the new element at the 'end' (the highest index). An element's memory location may have changed, but its index is still the same (a small demonstration follows after this list).
2) It is my practice that I simply don't exceed the capacity. Yes, by that I mean that I have been able to formulate all my problems such that I know the required maximum-capacity. I have never found this approach to be a problem.
3) If the vector contents can not be contained in system memory (my system's best upper limit capacity is roughly 3.5 GBytes), then perhaps a vector container (or any ram based container) is inappropriate. You will have to accomplish your goal using disk storage, perhaps with vector containers acting as a cache.
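A minimal sketch demonstrating the point from 1) above, that an index survives a reallocation while an address generally does not:
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    v.push_back(42);

    // Remember where element 0 used to live, and its index.
    auto old_addr = reinterpret_cast<std::uintptr_t>(&v[0]);
    std::size_t index = 0;

    for (int i = 0; i < 1000; ++i)   // almost certainly forces reallocations
        v.push_back(i);

    auto new_addr = reinterpret_cast<std::uintptr_t>(&v[0]);
    std::cout << "same address? " << (old_addr == new_addr) << '\n'; // very likely 0
    std::cout << "v[index] = " << v[index] << '\n';                  // still 42
}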
update 2017-July-31
Some code to consider from my latest Game of Life.
Each Cell_t (on the 2-d gameboard) has 8 neighbors.
In my implementation, each Cell_t has a neighbor 'list' (either std::array or std::vector, I've tried both), and after the gameboard has been fully constructed, each Cell_t's init() method is run, filling its neighbor 'list'.
// see Cell_t data attributes
std::array<int, 8> m_neighbors;
// ...
void Cell_t::init()
{
    int i = 0;
    m_neighbors[i]   = validCellIndx(m_row-1, m_col-1); // 1 - up left
    m_neighbors[++i] = validCellIndx(m_row-1, m_col);   // 2 - up
    m_neighbors[++i] = validCellIndx(m_row-1, m_col+1); // 3 - up right
    m_neighbors[++i] = validCellIndx(m_row,   m_col+1); // 4 - right
    m_neighbors[++i] = validCellIndx(m_row+1, m_col+1); // 5 - down right
    m_neighbors[++i] = validCellIndx(m_row+1, m_col);   // 6 - down
    m_neighbors[++i] = validCellIndx(m_row+1, m_col-1); // 7 - down left
    m_neighbors[++i] = validCellIndx(m_row,   m_col-1); // 8 - left
    //                 ^^^^^^^^^^^^^- returns info to quickly find cell
}
The int value in m_neighbors[i] is the index into the gameboard vector. To determine the next state of a cell, the code 'counts the neighbors' states.'
Note - Some cells are at the edge of the gameboard ... in this implementation, validCellIndx() can return a value indicating 'no-neighbor', (above top row, left of left edge, etc.)
// multiplier: for 100x200 cells, 20,000 * m_generation => ~20,000,000 ops
void countNeighbors(int& aliveNeighbors, int& totalNeighbors)
{
    { /* ... initialize m_count[]s to 0 */ }

    for (auto neighborIndx : m_neighbors) {               // each of 8 neighbors  // 123
        if (no_neighbor != neighborIndx)                   //                      // 8-4
            m_count[ gBoard[neighborIndx].m_state ] += 1;  //                      // 765
    }

    aliveNeighbors = m_count[ CellALIVE ]; // CellDEAD = 1, CellALIVE
    totalNeighbors = aliveNeighbors + m_count[ CellDEAD ];
} // Cell_Arr_t::countNeighbors
init() pre-computes the indexes of this cell's neighbors. The m_neighbors array holds index integers, not pointers. It is trivial to have NO pointers-into-the-gameboard vector.

How are vectors (in C++) with elements of variable size allocated?

I wrote the following code to accept test cases on a competitive programming website. It uses a vector named input of the structure Case to store the inputs for the given test cases all at once, and then processes them one at a time (I have left out the loops that take the input and calculate the output because they are irrelevant to the question):
#include <iostream>
#include <vector>
using namespace std;

// 'case' is a reserved keyword in C++, so the struct is named Case instead.
struct Case {
    int n, m;
    vector<int> jobsDone;
};

int main() {
    int testCase;
    cin >> testCase;
    vector<Case> input;
    input.reserve(testCase);
    // The rest of the code is supposed to be here
    return 0;
}
As I was writing this code, I realised that I did not understand how input.reserve(testCase) could work in a case where the element size is variable (since each instance of the structure Case also holds a vector of variable size). In fact, even if I had not explicitly written the reserve() statement, the vector would still have reserved space for some minimum number of elements.
For this particular situation, I have the following questions regarding the vector input:
Wouldn't random access in O(1) time be impossible in this case, since the beginning position of every element is not known?
How would the vector input manage element access at all when the beginning location of every element cannot be calculated? Will it pad all the entries to the size of the maximum entry?
Should I instead be implementing this with a vector of pointers, each pointing to an instance of Case? I am thinking about this because if the vector pads each element to a fixed size it wastes space, and if it instead keeps track of each element's location then random access is no longer constant time, so there would be no use for a vector anyway.
Every object type has a fixed size. This is what sizeof returns. A vector itself typically holds a pointer to the array of objects, the number of objects for which space has been allocated, and the number of objects actually contained. The size of these three things is independent of the number of elements in the vector.
For example, a vector<int> might contain:
1) An int * holding the address of the data.
2) A size_t holding the number of objects we've allocated space for
3) A size_t holding the number of objects contained in the vector.
This will probably be somewhere around 24 bytes, regardless of how many objects are in the vector. And this is what sizeof(vector<int>) will return.
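A small sketch to illustrate; the exact value is implementation-dependent, but on a typical 64-bit implementation it stays constant (often 24 bytes) no matter how many elements the vector holds:
#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    std::cout << sizeof(v) << '\n';   // e.g. 24: pointer plus two sizes

    v.resize(1000000);                // a million ints now live on the heap...
    std::cout << sizeof(v) << '\n';   // ...but the vector object itself is unchanged
}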

How can I allocate memory for a data structure that contains a vector?

If I have a struct InstanceData:
struct InstanceData
{
    unsigned usedInstances;
    unsigned allocatedInstances;
    void* buffer;

    Entity* entity;
    std::vector<float>* vertices;
};
And I allocate enough memory for an Entity and std::vector:
newData.buffer = size * (sizeof(Entity) + sizeof(std::vector<float>)); // Pseudo code
newData.entity = (Entity *)(newData.buffer);
newData.vertices = (std::vector<float> *)(newData.entity + size);
And then attempt to copy a vector of any size to it:
void SetVertices(unsigned i, std::vector<float> vertices)
{
    instanceData.vertices[i] = vertices;
}
I get an Access Violation Reading location error.
I've chopped up my code to make it concise, but it's based on Bitsquid's ECS, so just assume it works when I'm not dealing with vectors (it does). With this in mind, I'm assuming it's having issues because it doesn't know what size the vector is going to scale to. However, I thought the vectors might increase along another dimension, like this?:
Am I wrong? Either way, how can I allocate memory for a vector in a buffer like this?
And yes, I know vectors manage their own memory. That's beside the point. I'm trying to do something different.
It looks like you want InstanceData.buffer to have the actual memory space which is allocated/deallocated/accessed by other things. The entity and vertices pointers then point into this space. But by trying to use std::vector, you are mixing up two completely incompatible approaches.
1) You can do this with the language and the standard library, which means no raw pointers, no "new", no "sizeof".
struct Point { float x; float y; };  // usually this is int, not float

struct InstanceData {
    Entity entity;
    std::vector<Point> vertices;
};
This is the way I would recommend. If you need to output to a specific binary format for serialization, just handle that in the save method.
2) You can manage the memory internal to the class, using oldschool C, which means using N*sizeof(float) for the vertices. Since this will be extremely error prone for a new programmer (and still rough for vets), you must make all of this private to class InstanceData, and do not allow any code outside InstanceData to manage them. Use unit tests. Provide public getter functions. I've done stuff like this for data structures that go across the network, or when reading/writing files with a specified format (Tiff, pgp, z39.50). But just to store in memory using difficult data structures -- no way.
Some other questions you asked:
How do I allocate memory for std::vector?
You don't. The vector allocates its own memory, and manages it. You can tell it to resize() or reserve() space, or push_back, but it will handle it. Look at http://en.cppreference.com/w/cpp/container/vector
How do I allocate memory for a vector [sic] in a buffer like this?
You seem to be thinking of an array. You're way off with your pseudo code so far, so you really need to work your way up through a tutorial. You have to allocate with "new". I could post some starter code for this, if you really need, which I would edit into the answer here.
Also, you said something about vector increasing along another dimension. Vectors are one dimensional. You can make a vector of vectors, but let's not get into that.
edit addendum:
The basic idea with a megabuffer is that you allocate all the required space in the buffer, then you initialize the values, then you use it through the getters.
The data layout is "Header, Entity1, Entity2, ..., EntityN"
// I did not check this code in a compiler, sorry, need to get to work soon
MegaBuffer::MegaBuffer() { AllocateBuffer(0); }
MegaBuffer::~MegaBuffer() { ReleaseBuffer(); }

void MegaBuffer::AllocateBuffer(size_t size /*, whatever is needed for the header*/)
{
    if (nullptr != buffer)
        ReleaseBuffer();

    size_t total_bytes = sizeof(Header) + size * sizeof(Entity);
    buffer = new unsigned char[total_bytes];

    // need to set up the header
    header = reinterpret_cast<Header*>(buffer);
    header->count = 0;
    header->allocated = size;

    // set up internal pointer to the first entity, just past the header
    entity = reinterpret_cast<Entity*>(buffer + sizeof(Header));
}

void MegaBuffer::ReleaseBuffer()
{
    delete[] buffer;
    buffer = nullptr;
}

Entity& MegaBuffer::operator[](int n) { return entity[n]; }
The header is always a fixed size, appears exactly once, and tells you how many entities you have. In your case there's no header, because you are using the member variables "usedInstances" and "allocatedInstances" instead. So you do sort of have a header, but it is not part of the allocated buffer. But you don't want to allocate 0 bytes, so just set usedInstances = 0; allocatedInstances = 0; buffer = nullptr;
I did not code for changing the size of the buffer, because the bitsquid ECS example covers that, but he doesn't show the first time initialization. Make sure you initialize n and allocated, and assign meaningful values for each entity before you use them.
You are not doing the bitsquid ECS the same as the link you posted. In that, he has several different objects of fixed size in parallel arrays. There is an entity, its mass, its position, etc. So entity[4] is an entity which has mass equal to "mass[4]" and its acceleration is "acceleration[4]". This uses pointer arithmetic to access array elements. (built in array, NOT std::Array, NOT std::vector)
The data layout is "Entity1, Entity2, ..., EntityN, mass1, mass2, ..., massN, position1, position2, ..., positionN, velocity1 ... " you get the idea.
If you read the article, you'll notice he says basically the same thing everyone else said about the standard library. You can use an std container to store each of these arrays, OR you can allocate one megabuffer and use pointers and "built in array" math to get to the exact memory location within that buffer for each item. In the classic faux-pas, he even says "This avoids any hidden overheads that might exist in the Array class and we only have a single allocation to keep track of." But you don't know if this is faster or slower than std::Array, and you're introducing a lot of bugs and extra development time dealing with raw pointers.
I think I see what you are trying to do.
There are numerous issues. First, you are taking a buffer of raw, uninitialized memory and telling C++ that a vector-sized piece of it is a std::vector, but at no time do you actually call the vector's constructor, which is what initializes the pointers and bookkeeping inside it to valid values (a sketch of the placement-new fix appears at the end of this answer).
This has already been answered here: Call a constructor on a already allocated memory
The second issue is the line
instanceData.vertices[i] = vertices;
instanceData.vertices is a pointer to a vector, so to index into the vector itself you would actually need to write
(*(instanceData.vertices))[i]
The third issue is that the elements of *(instanceData.vertices) are floats, not vectors, so you should not be able to do that assignment there anyway.
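As mentioned for the first issue above, here is a minimal sketch of the placement-new idea from the linked answer, i.e. actually running the vector's constructor inside pre-allocated storage (illustrative only; note that you then also become responsible for calling the destructor yourself):
#include <new>       // placement new
#include <vector>

int main()
{
    using Vec = std::vector<float>;

    // Raw storage big enough (and suitably aligned) for one vector object.
    alignas(Vec) unsigned char storage[sizeof(Vec)];

    // Placement new: run the constructor in that storage.
    Vec* vec = new (storage) Vec();

    vec->push_back(1.0f);   // now safe to use

    vec->~Vec();            // the destructor must be called explicitly
}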

Does the array key determine array size in C++?

I'm storing some settings for objects in an array. The IDs of the objects are used as the key. The IDs start from 100000 and go up. If I were to input data for an object with ID 100 000, would C++ automatically create 99999 blank key entries starting from 0?
Array size is determined when you create an array.
To access an object at index 100 000 you need to have an array of at least that size, which answers your question.
If the array is smaller, you will access memory at
array begin address + (index * object size)
which is not a good thing. E.g. the following will print some data, but it is whatever data happens to be stored at that point in memory, outside your array (not a good thing):
string arr[3];
cout << arr[5] << endl;
Assuming you are talking about standard array like:
string arr[10];
An array's size is specified when you compile it; for example, you can't do:
string arr[]; // this will fail to compile, no size specified
But you do:
string arr[] = {"1","2","3"}; // array size is 3
string arr1[3]; // array size is 3
string arr2[3] = {"1"}; // array size is 3
If you want to map extra parameters onto an object you are better off using std::map, like:
class person {};
std::map<person*,int> PersonScore;
This assumes that the additional parameters are not logically part of the object otherwise you would just add them to the object.
Maybe you want something along the lines of:
class ArrayPlus100k {
    Item underlyingArray[NUM_ELEMENTS];
public:
    Item& operator[](int i) { return underlyingArray[i - 100000]; }
    // etc.
};
If you truly mean an array, and by key you mean index, then subtracting 100,000 from your index will give you a zero-based array index. There will be no unused entries.
There may be a better container than a flat array. Choosing the right data structure depends on what you are trying to do. If you are storing objects using a key, you might want to use a std::map<key, value>.
What happens depends entirely on the data structure you choose to use. If you use a map, only the items you insert will take up space in memory. If you use new to allocate an actual array, then you will want to allocate only enough space for the items you want to store. In that case, adjust your index by subtracting 100,000.
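A minimal sketch of the map-based approach; Settings is a hypothetical type standing in for whatever per-object data you store:
#include <map>

struct Settings { int output; bool valid; int state; };

int main()
{
    std::map<int, Settings> settingsById;   // keyed directly by object ID

    settingsById[100000] = {1, true, 0};    // only inserted entries use memory
    settingsById[100042] = {3, false, 2};   // no blank entries 0..99999 are created
}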
No, it will not create 0-99999, but rather start from 100000 to your array size.
For example, if you declare the following:
int arr[5];
Starting from arr[2], you can store up to arr[7].
I hope you understand...