Memory allocation function based on chunks - c++

I have to create a memory allocation program that is supposed to allocate memory blocks to a memory that contains 8 contiguous memory positions. N is the size of the memory, M is the maximum number of positions a program can request and X is the number of memory positions requested. The memory is divided in M chunks, each equally made of N/M memory positions. N/M must be greater than M, otherwise a program will not have enough memory allocated to a chunk. The first element of POS stores the position (in the memory) where the first available block of 1 memory position is located, the second element stores the position (in memory) where the first available block of 2 memory positions is located and so on. If the ith element in the array POS stores number -1, it means that there are no blocks of (i+1) memory positions available in the ith chunk. If POS[X-1] stores a value different from -1, it means there is space to allocate a block of X. In that case, the value stored in POS[X-1] is returned. Otherwise, the value -1 is returned.
From what I understand of the question, it wants me to create a memory allocation program that separates 8 memory positions into chunks. Each chunk is required to be about as big as M, which is the largest number of memory positions a program can request. So M is 3 in the question so each chunk should be about 3 memory positions, meaning two chunks of 3 and one chunk of 2. Now the first chunk is allocated only for requests that only need 1 memory position, second chunk for requests that need 2 contiguous memory positions and third chunk for requests that need 3 memory positions. Example using 16 memory positions. Here's what I have so far.
#include <iostream>
using namespace std;
int search(int arr[], int n, int m, int x)
{
int POS[m];
if(POS[x-1]!=-1){
return POS[x-1];
} return -1;
}
int main(void)
{
int arr[] = { -1, -1, -1, -1, -1, -1, -1, -1 }; //example array, -1 represents available memory space
int x = 2; //example number of memory positions requested
int n = sizeof(arr) /sizeof(arr[0]); //size of array
int m = 3; //maximum number of positions a program can request
int result = search(arr, n, m, x);
if(result == -1) {
cout << "No slots available";
} else {
cout << x << " slots available at index " << result;
}
}

To rephrase a little, it sounds like your question is essentially this: there is an array of N ints where -1 represents an "available" element, and the goal is to find the position of X contiguous available elements. I'm ignoring M and the POS array.
In other words, find a run of X or more -1 values. Here is an outline to do that with an O(N) search:
Initialize count to zero.
Make a loop, for i = 0, 1, ..., N - 1:
If arr[i] is available, increment the count. If the count is now equal to X, return (i - X + 1) as the starting position of the X contiguous available elements.
Otherwise, reset the count to zero.
If the loop finishes without returning, there is no such run, so return -1.
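The outline above can be sketched as follows. This is a minimal sketch, not the asker's required interface: the function name find_free_run and the use of std::vector instead of a raw array are assumptions made for illustration, and -1 marks a free cell as in the question.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the index of the first run of x consecutive free cells
// (-1 marks a free cell), or -1 if no such run exists.
int find_free_run(const std::vector<int>& arr, int x)
{
    assert(x > 0);
    int count = 0;  // length of the current run of free cells
    for (std::size_t i = 0; i < arr.size(); ++i) {
        if (arr[i] == -1) {
            if (++count == x) {
                // The run ends at i, so it starts x - 1 cells earlier.
                return static_cast<int>(i) - x + 1;
            }
        } else {
            count = 0;  // a used cell breaks the run
        }
    }
    return -1;
}
```

For the array { -1, 7, -1, -1, 7, -1, -1, -1 }, a request for 2 cells should land at index 2 and a request for 3 cells at index 5.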
Edit: Based on your comment and example image, it sounds like the point of M-sized chunks is that each chunk of memory may contain only one allocation. This approach of course wastes some elements when allocations are smaller than M, but does make the allocation search easier.
Outline:
For each chunk...
Check whether the first arr element of the chunk is unused. If so, then the whole chunk is available by the "one allocation per chunk" assumption.
To cover an edge case, if this is the last chunk, check that its size is at least X.
If both of these conditions are true, return the current chunk.
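A possible sketch of this chunk-based search, under the "one allocation per chunk" assumption stated above. The function name find_free_chunk is hypothetical, chunks are assumed to be m cells long with a possibly shorter last chunk, and the sketch returns the chunk's starting index rather than a chunk number.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns the start index of the first unused chunk that can hold x cells,
// or -1 if there is none. Assumes each chunk holds at most one allocation,
// so a chunk is free exactly when its first cell is free (-1).
int find_free_chunk(const std::vector<int>& arr, int m, int x)
{
    assert(m > 0 && x > 0 && x <= m);
    const int n = static_cast<int>(arr.size());
    for (int start = 0; start < n; start += m) {
        // The last chunk may be shorter than m when n is not a multiple of m.
        const int chunk_size = std::min(m, n - start);
        if (arr[start] == -1 && chunk_size >= x) {
            return start;
        }
    }
    return -1;
}
```

With 8 cells and m = 3 the chunks start at indices 0, 3 and 6, and the last chunk only has 2 cells, which is the edge case the outline mentions.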

Related

Is the size of a vector simply the size of the sum of its parts?

For example, a float is 4 bytes. Does this mean that a vector containing 10 floats is exactly 40 bytes?
One may interpret the "size" of a vector in different ways:
vector::size() gives the number of elements currently stored in the vector, regardless of the memory they occupy. So myVector.size() in your case would be 10.
vector::capacity() gives you the number of elements for which the vector has allocated memory so far. Note that a vector, when continuously appending elements, allocates memory in "chunks", i.e. it might reserve space for 50 new elements at once in order to avoid repeated memory allocations. So capacity() >= size() is always true.
sizeof(myVector) gives you the size of the vector data structure itself, which manages a dynamically growing series of elements. The elements typically live in dynamically allocated memory, which is not reflected by sizeof, so sizeof is of little use in most cases.
See the following code:
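#include <iostream>
#include <vector>
using namespace std;
int main() {
    vector<float> fv;
    for (int i = 0; i < 10; i++) {
        fv.push_back(1.0f);
    }
    cout << "size(): " << fv.size() << endl;         // number of elements
    cout << "capacity(): " << fv.capacity() << endl; // elements that fit without reallocating
    cout << "sizeof(fv): " << sizeof(fv) << endl;    // size of the vector object itself
    return 0;
}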
Output:
size(): 10
capacity(): 16
sizeof(fv): 24
Hope it helps.
A std::vector has 3 different sizes. There is the size of the vector object itself, and you get that with sizeof(std::vector<some_type>). This size isn't really useful for anything though. Typically this will be the size of three pointers, as that is how vector is typically implemented.
The second size is what is returned from the size() member function. The value returned from this is the number of elements in the vector. This is the "size" most people use when talking about the size of the vector.
The last size a vector has is the total number of elements it has allocated currently. That is obtained from using the capacity member function.
So a vector holding 10 floats needs to use at least 10 floats' worth of memory, but it could be using more, as the capacity() is allowed to be greater than its size(). But the size of the vector object itself (sizeof(name_of_vector)) will be some fixed value, and that value will never change no matter how many elements you add into the vector.

2d std::vector Contiguous Memory?

Consider the following code, which allocates a 2d std::vector<std::vector<int> > of size 5 x 3 and prints the memory addresses of each element:
#include <iostream>
#include <vector>
int main() {
int n = 5, m = 3;
std::vector<std::vector<int> >vec (n, std::vector<int>(m));
for (int i = 0; i < n; ++i) {
for (int j = 0; j < m; ++j) {
std::cout << &(vec[i][j]) << " ";
}
std::cout << "\n";
}
}
Output:
0x71ecc0 0x71ecc4 0x71ecc8
0x71ece0 0x71ece4 0x71ece8
0x71ed00 0x71ed04 0x71ed08
0x71ed20 0x71ed24 0x71ed28
0x71ed40 0x71ed44 0x71ed48
Of course, for a fixed row, each column is contiguous in memory, but the rows are not. In particular, each row is 32 bytes past the start of the previous row, but since each row is only 12 bytes, this leaves a gap of 20 bytes. For example, since I thought vectors allocate contiguous memory, I would have expected the first address of the second row to be 0x71eccc. Why is this not the case, and how does vector decide how much of a gap to give?
The overhead size of a vector is not 0. You have 24 bytes between the last element of your vector and the first element of the next vector. Try this:
cout << sizeof(std::vector<int>) << endl;
You will find the output to be 24 (Likely for your implementation of std::vector and compiler etc). Live example which happens to match.
You can imagine the vector layout like this:
If you want your elements to actually be contiguous then you need to do:
Use a std::array<std::array<int, M>, N>, with sizes N and M known at compile time, for no overhead (C++11 and later). Note that this cannot be resized.
Use std::vector<int> and the formula row * numCols + col to access the element for row, col.
Agree with Mr. Fox's results and solutions [1]. Disagree with the logic used to get there.
In order to be resizable, a std::vector contains a dynamically allocated block of contiguous memory, so the outer vector has a pointer to a block of storage that contains N vectors that are contiguous. However, each of these inner vectors contains a pointer to its own block of storage. The likelihood (without some really groovy custom allocator voodoo) of all N blocks of storage being contiguous is astonishingly small. The overhead of the N vectors is contiguous. The data is almost certainly not, but could be.
[1] Mostly. The std::arrays in a std::vector will be contiguous, but nothing prevents the implementor of std::array from putting in additional state information that prevents the arrays in the std::arrays from being contiguous. An implementor who wrote the std::array in such a way is an occupant of an odd headspace or solving a very interesting problem.
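To see the difference on your own machine, something like the following sketch could be used. The names rows_are_contiguous and FlatGrid are made up for this illustration. The first function checks whether each inner vector's storage happens to start where the previous one ends (the answer depends on the allocator, so no particular result is promised), and the struct shows the single-allocation alternative.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// True if every row's storage starts exactly where the previous row's ends.
// For a vector of vectors this is allowed but very unlikely.
bool rows_are_contiguous(const std::vector<std::vector<int>>& vec)
{
    for (std::size_t i = 0; i + 1 < vec.size(); ++i) {
        if (vec[i].data() + vec[i].size() != vec[i + 1].data()) {
            return false;
        }
    }
    return true;
}

// Flat alternative: one allocation, element (row, col) lives at
// index row * num_cols + col, so all elements are guaranteed adjacent.
struct FlatGrid {
    std::size_t num_rows;
    std::size_t num_cols;
    std::vector<int> cells;

    FlatGrid(std::size_t rows, std::size_t cols)
        : num_rows(rows), num_cols(cols), cells(rows * cols) {}

    int& at(std::size_t row, std::size_t col)
    {
        assert(row < num_rows && col < num_cols);
        return cells[row * num_cols + col];
    }
};
```

In a 5 x 3 FlatGrid, the first element of row 1 sits directly after the last element of row 0, which is what the question expected from the nested vector.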

C++ vector performance with predefined capacity

There are 2 ways to define a std::vector (that I know of):
std::vector<int> vectorOne;
and
std::vector<int> vectorTwo(300);
So if I use the first one, without a size, and fill it with 300 ints, it has to reallocate memory to store those ints. That would mean the elements would not, for example, sit at addresses 0x0 through 0x300; there could be other memory allocated in between because the storage was reallocated later. The second vector will already have those addresses reserved, so there would be no space in between.
Does this affect performance at all, and how could I measure this?
std::vector is guaranteed to always store its data in a contiguous block of memory. That means that when you add items, it has to try to increase its range of memory in use. If something else is in the memory following the vector, it needs to find a free block of the right size somewhere else in memory and copy all the old data plus the new data to it.
This is a fairly expensive operation in terms of time, so it tries to mitigate by allocating a slightly larger block than what you need. This allows you to add several items before the whole reallocate-and-move-operation takes place.
Vector has two properties: size and capacity. The former is how many elements it actually holds, the latter is how many places are reserved in total. For example, if you have a vector with size() == 10 and capacity() == 18, it means you can add 8 more elements before it needs to reallocate.
How and when the capacity increases exactly, is up to the implementer of your STL version. You can test what happens on your computer with the following test:
#include <iostream>
#include <vector>
int main() {
using std::cout;
using std::vector;
// Create a vector of 10 ints, all value-initialized to 0
vector<int> v(10);
std::cout << "v has size " << v.size() << " and capacity " << v.capacity() << "\n";
// Now add 90 values, and print the size and capacity after each insert
for(int i = 11; i <= 100; ++i)
{
v.push_back(i);
std::cout << "v has size " << v.size() << " and capacity " << v.capacity()
<< ". Memory range: " << &v.front() << " -- " << &v.back() << "\n";
}
return 0;
}
I ran it on IDEOne and got the following output:
v has size 10 and capacity 10
v has size 11 and capacity 20. Memory range: 0x9899a40 -- 0x9899a68
v has size 12 and capacity 20. Memory range: 0x9899a40 -- 0x9899a6c
v has size 13 and capacity 20. Memory range: 0x9899a40 -- 0x9899a70
...
v has size 20 and capacity 20. Memory range: 0x9899a40 -- 0x9899a8c
v has size 21 and capacity 40. Memory range: 0x9899a98 -- 0x9899ae8
...
v has size 40 and capacity 40. Memory range: 0x9899a98 -- 0x9899b34
v has size 41 and capacity 80. Memory range: 0x9899b40 -- 0x9899be0
You see the capacity increase and re-allocations happening right there, and you also see that this particular standard library implementation chooses to double the capacity every time you hit the limit.
On some systems the algorithm will be more subtle, growing faster as you insert more items (so if your vector is small, you waste little space, but if it notices you insert a lot of items into it, it allocates more to avoid having to increase the capacity too often).
PS: Note the difference between setting the size and the capacity of a vector.
vector<int> v(10);
will create a vector with capacity at least 10, and size() == 10. If you print the contents of v, you will see that it contains
0 0 0 0 0 0 0 0 0 0
i.e. 10 integers with their default values. The next element you push into it, may (and likely will) cause a re-allocation. On the other hand,
vector<int> v; // not "v()", which would declare a function instead of an object
v.reserve(10);
will create an empty vector, but with its capacity set to at least 10 rather than the default (typically 0). You can be certain that the first 10 elements you push into it will not cause a reallocation (and the next one probably will, but not necessarily, as reserve may actually set the capacity to more than what you requested).
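The guarantee about reserve() can be checked with a small sketch like this one. The helper name storage_moved_after_reserve is hypothetical; it reserves room, remembers where the storage is, pushes exactly that many elements, and reports whether the storage moved.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Reserves room for `count` ints, pushes `count` values, and returns true
// if the vector's storage was reallocated along the way.
bool storage_moved_after_reserve(std::size_t count)
{
    std::vector<int> v;   // empty vector, size() == 0
    v.reserve(count);     // capacity() is now at least count
    assert(v.empty() && v.capacity() >= count);

    const int* before = v.data();
    for (std::size_t i = 0; i < count; ++i) {
        v.push_back(static_cast<int>(i));
    }
    return v.data() != before;
}
```

Since push_back only reallocates when the new size would exceed the capacity, this should return false for any count.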
You should use reserve() method:
std::vector<int> vec;
vec.reserve(300);
assert(vec.size() == 0); // But memory is allocated
This solves the problem.
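In your example it can affect the performance. You can expect that when you overflow the vector, it grows the allocated memory by a constant factor (commonly doubling it). So, if you push_back() into a vector N times (and you haven't called reserve()), you can expect O(log N) reallocations, each of them copying all the values stored so far. Because the amounts copied form a geometric series (1 + 2 + 4 + ... + N < 2N), the total copying work is O(N) rather than O(N log N), which is consistent with the amortized constant time the C++ standard requires of push_back. Calling reserve() up front removes those reallocations and copies altogether.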
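How many reallocations actually happen depends on the standard library, so it may be worth counting them. The sketch below (GrowthStats and measure_growth are names invented here) watches capacity() change across n push_back calls and adds up how many existing elements had to be moved each time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct GrowthStats {
    std::size_t reallocations;    // how often capacity() changed
    std::size_t elements_copied;  // existing elements moved to a new block
};

// Pushes n ints into a vector without reserve() and records the growth.
GrowthStats measure_growth(std::size_t n)
{
    std::vector<int> v;
    GrowthStats stats{0, 0};
    for (std::size_t i = 0; i < n; ++i) {
        const std::size_t old_capacity = v.capacity();
        const std::size_t old_size = v.size();
        v.push_back(0);
        if (v.capacity() != old_capacity) {
            ++stats.reallocations;
            stats.elements_copied += old_size;
        }
    }
    assert(v.size() == n);
    return stats;
}
```

With geometric growth the reallocation count should grow roughly with log n, and elements_copied should stay within a small constant multiple of n; the exact numbers vary between implementations.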
The differences can be dramatic because if data is not adjacent in memory, it may have to be fetched from main memory, which is roughly 200 times slower than an L1 cache fetch. This will not happen within a vector because the data in a vector is required to be adjacent.
See https://www.youtube.com/watch?v=YQs6IC-vgmo
Use std::vector::reserve when you can, to avoid reallocation events. The C++ <chrono> header has good time utilities for measuring the time difference in high-resolution ticks.
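One way to take such a measurement with <chrono> is sketched below. The function name time_fill_us is made up for this example, and the timings it returns depend on the machine, the compiler flags and the standard library, so treat the numbers as indicative only.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Fills a vector with n ints, optionally calling reserve(n) first, and
// returns the elapsed wall-clock time in microseconds.
long long time_fill_us(std::size_t n, bool reserve_first)
{
    const auto start = std::chrono::steady_clock::now();

    std::vector<int> v;
    if (reserve_first) {
        v.reserve(n);  // one allocation up front
    }
    for (std::size_t i = 0; i < n; ++i) {
        v.push_back(static_cast<int>(i));
    }

    const auto stop = std::chrono::steady_clock::now();
    assert(v.size() == n);
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}
```

Calling time_fill_us(n, false) and time_fill_us(n, true) several times for a large n and comparing the typical values gives a rough idea of what the reallocations cost.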

Calculating size of vector of vectors in bytes

typedef vector<vector<short>> Mshort;
typedef vector<vector<int>> Mint;
Mshort mshort(1 << 20, vector<short>(20, -1)); // Xcode shows 73MB
Mint mint(1 << 20, vector<int>(20, -1)); // Xcode shows 105MB
short uses 2 bytes and int 4 bytes; please note that 1 << 20 = 2^20;
I am trying to calculate ahead (on paper) usage of memory but I am unable to.
sizeof(vector<>) // = 24 //no matter what type
sizeof(int) // = 4
sizeof(short) // = 2
I do not understand: mint should be double the mshort but it isn't. When running program only with mshort initialisation Xcode shows 73MB of memory usage; for mint 105MB;
mshort.size() * mshort[0].size() * sizeof(short) * sizeof(vector<short>) // = 1006632960
mint.size() * mint[0].size() * sizeof(int) * sizeof(vector<int>) // = 2013265920
//no need to use .capacity() because I fill vectors with -1
1006632960 * 2 = 2013265920
How does one calculate how much RAM a 2d std::vector or a 2d std::array will use?
I know the sizes ahead and each row has same number of columns.
The memory usage of your vectors of vectors will be e.g.
// the size of the data...
mshort.size() * mshort[0].size() * sizeof(short) +
// the size of the inner vector objects...
mshort.size() * sizeof mshort[0] +
// the size of the outer vector object...
// (this is ostensibly on the stack, given your code)
sizeof mshort +
// dynamic allocation overheads
overheads
The dynamic allocation overheads arise because the vectors internally use new to allocate memory for the elements they're to store, and for speed reasons the allocator may keep pools of fixed-size memory areas waiting for new requests. So if the vector effectively does a new short[20], with the data needing 40 bytes, it might end up with e.g. 48 or 64. The implementation may actually need to use some extra memory to store the array size, though for short and int there's no need to loop over the elements invoking destructors during delete[], so a good implementation will avoid that allocation and the no-op destruction behaviour.
The actual data elements for any given vector are contiguous in memory though, so if you want to reduce the overheads, you can change your code to use fewer, larger vectors. For example, using one vector with (1 << 20) * 20 will have negligible overhead - then rather than accessing [i][j] you can access [i * 20 + j] - you can write a simple class wrapping the vector to do this for you, most simply with a v(i, j) notation...
inline short& operator()(size_t i, size_t j) { return v_[i * 20 + j]; }
inline short operator()(size_t i, size_t j) const { return v_[i * 20 + j]; }
...though you could support v[i][j] by having v.operator[] return a proxy object that can be further indexed with []. I'm sure if you search SO for questions on multi-dimension arrays there'll be some examples - think I may have posted such code myself once.
The main reason to want vector<vector<x>> is when the inner vectors vary in length.
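The proxy idea mentioned above might look roughly like this. It is only a sketch: the class names Matrix and Row and the member size_in_elements are invented for illustration, and only the non-const access path is shown.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One flat vector of shorts that can still be indexed as m[i][j].
class Matrix {
public:
    Matrix(std::size_t rows, std::size_t cols, short fill = 0)
        : cols_(cols), v_(rows * cols, fill) {}

    // Lightweight proxy for one row; its operator[] selects the column.
    class Row {
    public:
        Row(short* first, std::size_t cols) : first_(first), cols_(cols) {}
        short& operator[](std::size_t j)
        {
            assert(j < cols_);
            return first_[j];
        }
    private:
        short* first_;      // first element of this row inside the flat vector
        std::size_t cols_;
    };

    // Returns a proxy pointing at the start of row i.
    Row operator[](std::size_t i)
    {
        assert(i * cols_ < v_.size());
        return Row(v_.data() + i * cols_, cols_);
    }

    std::size_t size_in_elements() const { return v_.size(); }

private:
    std::size_t cols_;
    std::vector<short> v_;
};
```

With Matrix m(4, 20, -1), the expression m[2][5] = 7 writes to flat index 2 * 20 + 5 while all 80 elements stay in a single allocation.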
Assuming glibc malloc:
Each memory chunk carries an additional 8-16 bytes (2 size_t) for the memory block header. On a 64-bit system that would be 16 bytes.
see code:
https://github.com/sploitfun/lsploits/blob/master/glibc/malloc/malloc.c#L1110
chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             Size of previous chunk, if allocated            | |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             Size of chunk, in bytes                       |M|P|
  mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             User data starts here...                          .
        .                                                               .
        .             (malloc_usable_size() bytes)                      .
        .                                                               |
Adding a 16-byte header per row gives approximately 83886080 bytes for short:
24 + 16 + mshort.size() (1048576) * (mshort[0].size() (20) * sizeof(short) (2) + sizeof(vector) (24) + header (16))
The same formula gives approximately 125829120 bytes for int.
But then I recomputed your numbers, and it looks like you are on 32 bit (the figures below correspond to an 8-byte header per row):
short 75497472, that is 72 MiB
int 117440512, that is 112 MiB
These look very close to the reported ones.
Use capacity(), not size(), to get the number of items, even if those are the same in your case.
Allocating a single vector of size rows * columns will save you header * 1048576 bytes.
Your calculation mshort.size() * mshort[0].size() * sizeof(short) * sizeof(vector<short>) // = 1006632960 is simply wrong. By your calculation, mshort takes 1006632960 bytes, which is 960MiB, and that is not true.
Let's ignore libc's overhead, and just focus on std::vector<>'s size:
mshort is a vector of 2^20 items, each of which is a vector<short> with 20 items.
So the size shall be:
mshort.size() * mshort[0].size() * sizeof(short) // Size of all short values
+ mshort.size() * sizeof(vector<short>) // Size of 2^20 vector<short>
+ sizeof(mshort) // Size of mshort itself, which can be ignored as overhead
The calculated size is 64MiB.
The same to mint, where the calculated size is 104MiB.
So mint is simply NOT double size of mshort.
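The arithmetic behind those two figures can be reproduced with a small helper. This is a sketch with a made-up name (estimate_bytes); it ignores allocator overhead, as the answer does, and takes the size of a vector object as a parameter that defaults to the 24 bytes reported in the question, since that value is implementation-specific.

```cpp
#include <cassert>
#include <cstddef>

// Estimated bytes used by a rows x cols vector of vectors:
// element data + one inner vector object per row + the outer vector object.
// Allocator (malloc) overhead is deliberately ignored.
std::size_t estimate_bytes(std::size_t rows, std::size_t cols,
                           std::size_t element_size,
                           std::size_t vector_object_size = 24)
{
    assert(rows > 0 && cols > 0);
    return rows * cols * element_size   // all stored elements
         + rows * vector_object_size    // the inner vector objects
         + vector_object_size;          // the outer vector object itself
}
```

For 2^20 rows of 20 elements this gives 2^20 * (40 + 24) + 24 bytes for short, about 64MiB, and 2^20 * (80 + 24) + 24 bytes for int, about 104MiB.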

I don't understand how to create and use dynamic arrays in C++

Okay so I have;
int grid_x = 5;
int * grid;
grid = new int[grid_x];
*grid = 34;
cout << grid[0];
Should line 3 create an array with 5 elements? Or fill the first element with the number 5?
Line 4 fills the first element, how do I fill the rest?
Without line 4, line 5 reads "-842150451".
I don't understand what is going on, I'm trying to create a 2 dimensional array using x and y values specified by the user, and then fill each element one by one with numeric values also specified by the user. My above code was an attempt to try it out with a 1 dimensional array first.
The default C++ way of creating a dynamic(ally resizable) array of int is:
std::vector<int> grid;
Don't play around with unsafe pointers and manual dynamic allocation when the standard library already encapsulates this for you.
To create a vector of 5 elements, do this:
std::vector<int> grid(5);
You can then access its individual elements using []:
grid[0] = 34;
grid[1] = 42;
You can add new elements to the back:
// grid.size() is 5
grid.push_back(-42);
// grid.size() now returns 6
Consult reference docs to see all operations available on std::vector.
Should line 3 create an array with 5 elements?
Yes. It won't initialise them though, which is why you see a weird value.
Or fill the first element with the number 5?
new int(grid_x), with round brackets, would create a single object, not an array, and specify the initial value.
There's no way to allocate an array with new and initialise every element to the same non-zero value. new int[grid_x]() zero-initialises the elements; for anything else you'll have to assign the values after allocation.
Line 4 fills the first element, how do I fill the rest?
You can use the subscript operator [] to access elements:
grid[0] = 34; // Equivalent to: *(grid) = 34
grid[1] = 42; // Equivalent to: *(grid+1) = 42
// ...
grid[4] = 77; // That's the last one: 5 elements from 0 to 4.
However, you usually don't want to juggle raw pointers like this; the burden of having to delete[] the array when you've finished with it can be difficult to fulfill. Instead, use the standard library. Here's one way to make a two-dimensional grid:
#include <vector>
std::vector<std::vector<int>> grid(grid_x, std::vector<int>(grid_y));
grid[x][y] = 42; // for any x between 0 and grid_x-1, y between 0 and grid_y-1
Or it might be more efficient to use a single contiguous array; you'll need your own little functions to access that as a two-dimensional grid. Something like this might be a good starting point:
template <typename T>
class Grid {
public:
Grid(size_t x, size_t y) : size_x(x), size_y(y), v(x*y) {}
T & operator()(size_t x, size_t y) {return v[y*size_x + x];}
T const & operator()(size_t x, size_t y) const {return v[y*size_x + x];}
private:
size_t size_x, size_y;
std::vector<T> v;
};
Grid<int> grid(grid_x,grid_y); // the element type must be given explicitly
grid(x,y) = 42;
Should line 3 create an array with 5 elements? Or fill the first element with the number 5?
Create an array with 5 elements.
Line 4 fills the first element, how do I fill the rest?
grid[n] = x;
Where n is the index of the element you want to set and x is the value.
Line 3 allocates memory for 5 integers side by side in memory so that they can be accessed and modified by...
The bracket operator, x[y] is exactly equivalent to *(x+y), so you could change Line 4 to grid[0] = 34; to make it more readable (this is why grid[2] will do the same thing as 2[grid]!)
An array is simply a contiguous block of memory. Therefore it has a starting address.
int * grid;
Is the C representation of the address of an integer, you can read the * as 'pointer'. Since your array is an array of integers, the address of the first element in the array is effectively the same as the address of the array. Hence line 3
grid = new int[grid_x];
allocates enough memory (on the heap) to hold the array and places its address in the grid variable. At this point the content of that memory is whatever it was when the physical silicon was last used. Reading from uninitialised memory will result in unpredictable values, hence your observation that leaving out line 4 results in strange output.
Remember that * pointer? On line four you can read it as 'the content of the pointer' and therefore
*grid = 34;
means set the content of the memory pointed to by grid to the value 34. But line 3 gave grid the address of the first element of the array. So line 4 sets the first element of the array to be 34.
In C, arrays use a zero-based index, which means that the first element of the array is number 0 and the last is number-of-elements-in-the-array - 1. So one way of filling the array is to index each element in turn to set a value to it.
for(int index = 0; index < grid_x; index++)
{
grid[index] = 34;
}
Alternatively, you could continue to use a pointer to do the same job.
for (int* pointerToElement = grid; pointerToElement != grid + grid_x; )
{
    // save 34 to the address held by the pointer
    // and post-increment the pointer to the next element.
    *pointerToElement++ = 34;
}
Have fun with arrays and pointers, they consistently provide a huge range of opportunities to spend sleepless hours wondering why your code doesn't work, PC reboots, router catches fire, etc, etc.
int grid_x = 5;
int * grid;
grid = new int[grid_x];
*grid = 34;
cout << grid[0];
Should line 3 create an array with 5 elements? Or fill the first element with the number 5?
Definitely the former. With the operator "new" you are allocating memory.
Line 4 fills the first element, how do I fill the rest?
Use operator [], e.g.:
for (int i = 0; i < grid_x; i++) { // Reset the buffer
grid[i] = 0;
}
Without line 4, line 5 reads "-842150451".
You are just reading uninitialized memory, it could be any value.
I don't understand what is going on, I'm trying to create a 2 dimensional array using x and y values specified by the user, and then fill each element one by one with numeric values also specified by the user. My above code was an attempt to try it out with a 1 dimensional array first.
Other users explained how to use vectors. If you only have to set the size of your array once, I usually prefer boost::scoped_array, which takes care of deleting when the variable goes out of scope.
For a two dimensional array of size not known at compile time, you need something a little bit trickier, like a scoped_array of scoped_arrays. Creating it will require necessarily a for loop, though.
#include <boost/scoped_array.hpp>
using boost::scoped_array;
int grid_x;
int grid_y;
// Reading values from user...
scoped_array<scoped_array<int> > grid(new scoped_array<int>[grid_x]);
for (int i = 0; i < grid_x; i++)
    grid[i].reset(new int[grid_y]); // scoped_array is noncopyable, so use reset() rather than assignment
You will be able then to access your grid elements as
grid[x][y];
Note:
It would work also taking scoped_array out of the game,
typedef int* p_int_t;
p_int_t* grid = new p_int_t [grid_x];
for (int i = 0; i < grid_x; i++)
grid[i] = new int[grid_y];
but then you would have to take care of deletion at the end of the array's life, of ALL sub arrays.
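For completeness, the manual clean-up could look like the sketch below. The helper names create_grid and destroy_grid are hypothetical; the point is only that every row is released with delete[] before the array of row pointers.

```cpp
#include <cassert>

// Allocates a grid_x by grid_y grid of zeros. The caller owns the result
// and must release it with destroy_grid.
int** create_grid(int grid_x, int grid_y)
{
    assert(grid_x > 0 && grid_y > 0);
    int** grid = new int*[grid_x];
    for (int i = 0; i < grid_x; i++)
        grid[i] = new int[grid_y]();  // the () zero-initialises the row
    return grid;
}

// Deletes every sub-array first, then the array of row pointers.
void destroy_grid(int** grid, int grid_x)
{
    for (int i = 0; i < grid_x; i++)
        delete[] grid[i];
    delete[] grid;
}
```

Forgetting destroy_grid, or deleting only the outer array, leaks the rows, which is the burden the vector and scoped_array versions take off your hands.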