How are vectors (in C++) having elements of variable size allocated?

I wrote the following code to accept test cases on a competitive programming website. It uses a vector input of the struct Case to store the inputs for the given test cases all at once, and then processes them one at a time. (I have left out the loops that take the input and compute the output because they are irrelevant to the question.)
#include <iostream>
#include <vector>
using namespace std;

// "case" is a reserved keyword in C++, so the struct is named Case here
struct Case {
    int n, m;
    vector<int> jobsDone; // length varies from test case to test case
};

int main() {
    int testCase;
    cin >> testCase;
    vector<Case> input;
    input.reserve(testCase);
    // The rest of the code is supposed to be here
    return 0;
}
As I was writing this code, I realised that the working of input.reserve(testCase) in such a case, where the element size is variable (since each instance of the struct Case also has a vector of variable size), would be difficult. In fact, even if I had not explicitly written the reserve() statement, the vector still would have reserved a minimum number of elements.
For this particular situation, I have the following questions regarding the vector input:
Wouldn't random access in O(1) time be impossible in this case, since the beginning position of every element is not known?
How would the vector input manage element access at all when the beginning location of every element cannot be calculated? Will it pad all the entries to the size of the maximum entry?
Should I rather be implementing this using a vector of pointers, each pointing to an instance of Case? I am thinking about this because if the vector pads each element to a common size, it wastes space; and if it instead maintains the location of each element, random access is not constant in time, so there is no use for a vector anyway.

Every object type has a fixed size. This is what sizeof returns. A vector itself typically holds a pointer to the array of objects, the number of objects for which space has been allocated, and the number of objects actually contained. The size of these three things is independent of the number of elements in the vector.
For example, a vector<int> might contain:
1) An int * holding the address of the data.
2) A size_t holding the number of objects for which space has been allocated.
3) A size_t holding the number of objects contained in the vector.
This will probably be somewhere around 24 bytes, regardless of how many objects are in the vector. And this is what sizeof(vector<int>) will return.
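A quick way to see this is that sizeof stays constant no matter how many elements a contained vector holds, because the element data lives in a separate heap allocation. A minimal sketch (the byte counts are implementation-specific; Case is the struct from the question, renamed because case is a keyword):
#include <iostream>
#include <vector>
using namespace std;

struct Case {
    int n, m;
    vector<int> jobsDone;
};

int main() {
    Case c;
    cout << sizeof(Case) << '\n';        // fixed, e.g. 32 on a typical 64-bit library
    c.jobsDone.resize(1000000);
    cout << sizeof(c) << '\n';           // unchanged: the million ints live on the heap
    cout << sizeof(vector<int>) << '\n'; // typically 24: pointer + size + capacity
    return 0;
}
Because every Case has the same fixed size, input[i] sits at a fixed offset from the start of the vector's storage, so random access remains O(1) and no padding to the largest entry is needed.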

Related

Array vs Vector for runtime of application

I am trying to decide whether I should be using an array or a vector for my case, which I will describe below:
Consider this scenario: You need to store 3 different types of elements: an integer (this is an integer from 1-4 which represents an output), a boolean (this represents whether the output is true or false), and a second integer (range 0-3 which represents the state). The maximum number of any of these values that can be created is 72.
The way I have achieved this is by creating 3 separate arrays that get filled when an object containing the above information is created. Likewise, the object-specific information is deleted when the object is destroyed. This information is needed from the start of when the application is executed until the application is closed (assuming the object is not destroyed).
Since it is rare to have 72 of these objects created in my application (the use case is highly unlikely), I'm not sure whether it is smart (as far as memory is concerned) to initialize these global arrays to 72 from the start, or to use a vector and have it grow as the objects are created.
So, my question is, given the above scenario, is it better to use arrays or vectors in that scenario?
Note: One thing to keep in mind is that the index of my arrays represents the specific order in which the objects were created, making it an easy way to keep track of my elements. I.e., the information at index 0 of any of the three arrays is the information for object 1, etc. Would I be able to keep this same indexing system for reference with the vectors?
I would use std::vector because it will track how many items are contained in it. I would also pre-allocate enough for all 72 items using the reserve() method, to avoid allocating more than once.
Also I would make a struct with your 3 values and have one vector of those structs.
#include <vector>

struct item
{
    int output; // could use std::int8_t to reduce memory
    bool valid;
    int state;  // same here
    // (or even use bitfields for all 3 values)
};

// ...

int main()
{
    std::vector<item> items;
    items.reserve(72); // preallocate enough for all items if you like
    // ... etc...
}
If you are really worried about memory you can use bitfields to cram your struct into a single byte:
struct item
{
    unsigned char output: 2; // 0-3 (need to add 1 to get 1-4)
    unsigned char valid: 1;
    unsigned char state: 2;  // 0-3
};
Both options are valid.
The only thing with arrays is that you need to track how far the array is filled with valid entries, whereas you can simply ask the vector for its size.
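To address the note about indexing: std::vector preserves insertion order, so the creation-order indexing scheme carries over unchanged. A minimal usage sketch (assuming the item struct above; the values are made up):
#include <vector>

struct item { int output; bool valid; int state; };

int main()
{
    std::vector<item> items;
    items.reserve(72);
    // Each push_back appends at the end, so items[0] holds the information
    // for the first object created, items[1] for the second, and so on:
    // the same indexing scheme as the three parallel arrays.
    items.push_back({3, true, 1});  // object 1 (requires C++11 brace init)
    items.push_back({1, false, 0}); // object 2
    int firstOutput = items[0].output; // information for object 1
    (void)firstOutput;
    return 0;
}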

Memory fragmentation using std::list?

I'm using a list of lists to store point data in my application.
Here is an example test I made:
// using a list of lists
list<list<Point>> ls;
for (int i = 0; i < 10000; ++i)
{
    list<Point> lp;
    lp.resize(4);
    lp.push_back(Point(1, 2)); // 4 default points + 1 pushed = 5 per inner list
    ls.push_back(lp);
}
I assume that the memory used will be
10k elements * 5 points * point size = 10000*5*2*4 = 400,000 bytes + some overhead of the list container, but the memory used by the program rises dramatically.
Is it due to the overhead of the list container, or maybe because of memory fragmentation?
EDIT:
add some info and another example
Point is the MFC CPoint class (or you can define your own point class with int x, y). I'm using VS2008 in debug mode on Windows XP, and the Windows Task Manager to view the application's memory.
I can't use a vector instead of the outer list because I don't know its total size N beforehand, so I must push_back every new entry.
here is the modified example
int N = 10000;
list<vector<CPoint>> ls;
for (int i = 0; i < N; ++i)
{
    vector<CPoint> vp;
    vp.resize(5);
    vp.reserve(5); // no-op here: resize(5) already guarantees capacity >= 5
    ls.push_back(vp);
}
and I compare it to
CPoint* p= new CPoint[N*5];
It's not "+ some overhead of list container". List overhead is linear in the number of objects, not constant. There are 50,000 Points, but with each Point you also have two pointers (std::list is doubly linked), and with each element in ls you also have two pointers. Plus, each list is going to have a head and a tail pointer.
So that's 140,002 (I think) extra pointers that your math doesn't account for. Note that this dwarfs the size of the Point objects themselves, since they're so small. Are you sure that list is the right container for you? vector has constant overhead - basically three pointers per container, which would be just 30,003 additional pointers on top of the Point objects. That's a large memory saving - if that is something that matters.
[Update based on Bill Lynch's comment] vector could allocate more space than 5 for your points. Worst case, it will allocate twice as much space as you need. But since sizeof(Point) == sizeof(Point*) for you, that's still strictly better than list, since list will always use three times as much space.
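For readers who want to check the arithmetic, here is a minimal sketch of the pointer accounting described above (the exact per-node layout is implementation-specific; this just mirrors the doubly-linked-list model the answer assumes):
#include <cstddef>
#include <iostream>

int main()
{
    const std::size_t lists  = 10000;     // inner containers in ls
    const std::size_t points = lists * 5; // 5 Points per inner container

    // list<list<Point>>: prev/next per Point node, prev/next per outer
    // node, plus head/tail pointers for each inner list and for ls itself.
    std::size_t listPointers = points * 2 + lists * 2 + (lists + 1) * 2;

    // vector<vector<Point>>: roughly three pointers per vector,
    // including the outer one.
    std::size_t vectorPointers = lists * 3 + 3;

    std::cout << listPointers << " vs " << vectorPointers << '\n'; // 140002 vs 30003
    return 0;
}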

Heap_size in heap_sort

I'm reading Cormen's "Introduction to Algorithms", and I'm trying to implement a heap-sort, and there's one thing I continually fail to understand: how do we calculate the heap_size for a given array?
My textbook says
An array A that represents a heap is an object with two attributes:
A.length, which (as usual) gives the number of elements in the array,
and A.heap-size, which represents how many elements in the heap are
stored within array A. That is, although A[1 .. A.length] may contain
numbers, only the elements in A[1 .. A.heap-size], where 0 <= A.heap-size <=
A.length, are valid elements of the heap.
If I implement the array as std::vector<T> Arr, then its size would be Arr.size(), but what its heap_size would be is currently beyond me.
The heap size should be a separately stored variable, which you manage yourself.
Whenever you remove from or add to the heap, you should decrement or increment the value appropriately.
In C++, using a vector, you may actually be able to use the size, since the underlying representation is an array that's at least as big as the size of the vector, and it's guaranteed to stay the same size if you call resize with a smaller size. (So the underlying array will be the array size and the vector size will be the heap size).
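As an illustration, here is a minimal heap-sort sketch over a std::vector, keeping heapSize as a separate variable exactly as suggested above (0-based indexing, so the children of i are 2i+1 and 2i+2 rather than the book's 2i and 2i+1):
#include <iostream>
#include <utility>
#include <vector>

// Restore the max-heap property for the subtree rooted at i,
// looking only at the first heapSize elements of a.
void maxHeapify(std::vector<int>& a, std::size_t heapSize, std::size_t i)
{
    std::size_t largest = i;
    std::size_t left = 2 * i + 1, right = 2 * i + 2;
    if (left < heapSize && a[left] > a[largest])   largest = left;
    if (right < heapSize && a[right] > a[largest]) largest = right;
    if (largest != i) {
        std::swap(a[i], a[largest]);
        maxHeapify(a, heapSize, largest);
    }
}

void heapSort(std::vector<int>& a)
{
    std::size_t heapSize = a.size(); // a.size() plays the role of A.length
    for (std::size_t i = heapSize / 2; i-- > 0; ) // build the max-heap
        maxHeapify(a, heapSize, i);
    while (heapSize > 1) {
        std::swap(a[0], a[heapSize - 1]); // move the max just past the heap
        --heapSize;                       // that slot is no longer in the heap
        maxHeapify(a, heapSize, 0);
    }
}

int main()
{
    std::vector<int> v{5, 1, 4, 2, 3};
    heapSort(v);
    for (int x : v) std::cout << x << ' '; // prints: 1 2 3 4 5
}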

inserting into the middle of an array

I have an array int *playerNum which stores the list of all the numbers of the players on the team. Each slot, e.g. playerNum[1], represents a position on the team. If I wanted to add a new player for a new position on the team - that is, insert a new element into the array somewhere near the middle - how would I go about doing this?
At the moment, I was thinking you memcpy up to the position where you want to insert the player into a new array, then insert the new player, and copy over the rest of it?
(I have to use an array)
If you're using C++, I would suggest not using memcpy or memmove but instead using the copy or copy_backward algorithms. These will work on any data type, not just plain old integers, and most implementations are optimized enough that they will compile down to memmove anyway. More importantly, they will work even if you change the underlying type of the elements in the array to something that needs a custom copy constructor or assignment operator.
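For instance, a minimal sketch of an in-place insert with std::copy_backward (assuming the array has spare capacity beyond its n valid entries; the values are made up):
#include <algorithm>
#include <iostream>

int main()
{
    int playerNum[10] = {10, 20, 30, 40, 50}; // capacity 10, 5 valid entries
    int n = 5;                                // current number of players
    int pos = 2;                              // insert before index 2

    // Shift [pos, n) one slot to the right. copy_backward copies from the
    // end first, so the overlapping source and destination are safe here.
    std::copy_backward(playerNum + pos, playerNum + n, playerNum + n + 1);
    playerNum[pos] = 25;
    ++n;

    for (int i = 0; i < n; ++i)
        std::cout << playerNum[i] << ' '; // 10 20 25 30 40 50
    return 0;
}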
If you have to use an array, after having made sure you have enough storage (using realloc if necessary), use memmove to shift the items from the insertion point to the end by one position, then save your new player at the desired location.
You can't use memcpy if the source and target areas overlap.
This will fail as soon as the objects in your array have non-trivial copy-constructors, and it's not idiomatic C++. Using one of the container classes is much safer (std::vector or std::list for instance).
Your solution using memcpy is correct (under the few assumptions mentioned by others).
However, since you are programming in C++, it is probably a better choice to use std::vector and its insert method:
vector<int> myvector(3, 100);              // 100 100 100
myvector.insert(myvector.begin() + 1, 42); // 100 42 100 100
(Note that insert takes an iterator, not an index.)
An array occupies a contiguous block of memory, and there is no function to insert an element into the middle. You can create a new array whose size is larger than the original's by one, then copy the original array into the new one along with the new member:
int insertPos = arSize / 2; // insert in the middle, as in the original sketch
// copy the elements before the insertion point
for (int i = 0; i < insertPos; i++)
{
    newarray[i] = ar[i];
}
// place the new element
newarray[insertPos] = newelement;
// copy the remaining elements, shifted right by one
for (int i = insertPos; i < arSize; i++)
{
    newarray[i + 1] = ar[i];
}
If you use the STL, things become easier: use std::list.
As you're talking about an array and "insert", I assume that it is a sorted array. You don't necessarily need a second array, provided that the capacity N of your existing array is large enough to store more entries (N > n, where n is the number of current entries). You can move the entries from k through n-1 (zero-indexed) to k+1 through n, where k is the desired insert position, insert the new element at index k, and increase n by one. If the array is not large enough to begin with, you can follow your proposed approach, or just allocate a new array of larger capacity N' and copy the existing data before applying the actual insert operation described above.
BTW: As you're using C++, you could easily use std::vector.
While it is possible to use arrays for this, C++ has better solutions to offer. For starters, try std::vector, which is a decent enough general-purpose container, based on a dynamically-allocated array. It behaves exactly like an array in many cases.
Looking at your problem, however, there are two downsides to arrays or vectors:
Indices have to be 0-based and contiguous; you cannot remove elements from the middle without losing the key/value associations for everything after the removed element, so if you remove the player at position 4, the player at position 9 will move to position 8
Random insertion and deletion (that is, anywhere except the end) is expensive - O(n), that is, execution time grows linearly with array size. This is because every time you insert or delete, a part of the array needs to be moved.
If the key/value thing isn't important to you, insertion/deletion isn't time-critical, and your container is never going to be really large, then by all means use a vector. If you need random insertion/deletion performance but the key/value thing isn't important, look at std::list (although you won't get random access then, that is, the [] operator isn't defined, as implementing it would be very inefficient for linked lists; linked lists are also very memory-hungry, with an overhead of two pointers per element). If you want to maintain key/value associations, std::map is your friend, as shown in the sketch below.
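A minimal sketch of the std::map option, showing that erasing one entry leaves every other key/value association untouched (the positions and names are made up):
#include <iostream>
#include <map>
#include <string>

int main()
{
    std::map<int, std::string> players; // position -> player name
    players[4] = "Dave";
    players[9] = "Nina";

    players.erase(4); // position 9 still maps to Nina; nothing shifts

    for (const auto& p : players)
        std::cout << p.first << " -> " << p.second << '\n'; // prints: 9 -> Nina
    return 0;
}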
Losing the tail (the last array element is overwritten to make room):
#include <stdio.h>
#include <string.h>

#define s 10
int L[s];

void insert(int v, int p, int *a)
{
    // shift a[p] .. a[s-2] one slot to the right; a[s-1] is lost
    memmove(a + p + 1, a + p, (s - p - 1) * sizeof(int));
    *(a + p) = v;
}

int main()
{
    for (int i = 0; i < s; i++) L[i] = i;
    insert(11, 6, L);
    for (int i = 0; i < s; i++) printf("%d %p\n", L[i], (void*)&L[i]);
    return 0;
}

Does the array key determine array size in C++?

I'm storing some settings for objects in an array. The IDs of the objects are used as the key. The IDs start from 100000 and go up. If I were to input data for an object with ID 100000, would C++ automatically create 99999 blank key entries starting from 0?
Array size is determined when you create an array.
To access an object at index 100000 you need an array of at least that size, which answers your question.
If the array is smaller, you will access memory at
array begin address + (index * object size)
which is not a good thing. E.g. the following will print some data, but it is data stored at that point in memory, outside of your array (not a good thing):
string arr[3];
cout << arr[5] << endl;
Assuming you are talking about standard array like:
string arr[10];
An array's size is fixed when you compile it; for example, you can't do:
string arr[]; // this will fail to compile, no size specified
But you do:
string arr[] = {"1","2","3"}; // array size is 3
string arr1[3]; // array size is 3
string arr2[3] = {"1"}; // array size is 3
If you want to map extra parameters onto objects, you are better off using std::map, like:
class person {};
std::map<person*,int> PersonScore;
This assumes that the additional parameters are not logically part of the object otherwise you would just add them to the object.
Maybe you want something along the lines of:
const int NUM_ELEMENTS = 50000; // however many IDs you need, starting at 100000
struct Item { /* your settings */ };

class ArrayPlus100k {
    Item underlyingArray[NUM_ELEMENTS];
public:
    Item& operator [] (int i) { return underlyingArray[i - 100000]; }
    // etc.
};
If you truly mean an array, and by key you mean index, then subtracting 100,000 from your index will provide you with a zero-based array index. There will be no unused entries.
There may be a better container than a flat array. Choosing the right data structure depends on what you are trying to do. If you are storing objects using a key, you might want to use a std::map<key, value>.
What happens depends entirely on the data structure you choose to use. If you use a map, only the items you insert will take up space in memory. If you use new to allocate an actual array, then you will want to allocate only enough space for the items you want to store. In that case, adjust your index by subtracting 100,000.
No, it will not create entries 0-99999. A plain array always starts at index 0 and has exactly the size you declare.
For example, if you declare the following:
int arr[5];
the valid elements are arr[0] through arr[4]; there is no way to make the indices start at 100000, and accessing anything past arr[4] is undefined behavior.
I hope you understand...