Given this simple C++ code:
#include <array>
#include <iostream>
#include <memory>

using namespace std;

class Empty {};

int main() {
    array<unique_ptr<Empty>, 1024> empties;
    for (size_t i = 0; i < 1024; i++) {
        empties[i] = make_unique<Empty>();
    }
    for (auto& element : empties) {
        cout << "ptr: " << element.get() << endl;
    }
    return 0;
}
when run on ideone.com or on Windows, we get the following result:
ptr: 0x2b601f0c9ca0
ptr: 0x2b601f0c9cc0
ptr: 0x2b601f0c9ce0
ptr: 0x2b601f0c9d00
ptr: 0x2b601f0c9d20
ptr: 0x2b601f0c9d40
ptr: 0x2b601f0c9d60 ...
This seems completely strange to me. What kind of allocation algorithm in the OS or the standard library could cause every allocation to land at an address ending in 0?
The reason I did this experiment is that, given the OS uses a buddy algorithm that manages pages, an allocation request should cause the OS to hand out a piece of contiguous memory, and the next several allocations (until that memory runs out) should then land quite close together. If that were the case, then the cache problems usually attributed to linked lists might not be so significant in some situations, but I got results I was not expecting anyway.
Also, the second hex digit from the right in the printed addresses seems completely random. What might cause such behavior?
The minimum size of a class in C++ is one byte, if I recall correctly. As there is a consistent 32-byte spacing between the objects, it could be that this happens to be the size of the empty class you made. To determine this, try adding
std::cout << "Empty class size: " << sizeof(Empty) << std::endl;
It probably won't be 32 bytes; instead, there will probably just be some consistent spacing between each object.
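As a quick way to check both ideas, here is a minimal sketch (my own, not from the question) that prints sizeof(Empty) together with the byte distance between consecutive allocations:

#include <array>
#include <cstdint>
#include <iostream>
#include <memory>

class Empty {};

int main() {
    std::cout << "Empty class size: " << sizeof(Empty) << '\n';

    std::array<std::unique_ptr<Empty>, 8> empties;
    for (auto& e : empties)
        e = std::make_unique<Empty>();

    // Distance in bytes between consecutive allocations; the deltas may be
    // negative or irregular, since the allocator makes no ordering promise.
    for (std::size_t i = 1; i < empties.size(); ++i) {
        auto prev = reinterpret_cast<std::uintptr_t>(empties[i - 1].get());
        auto curr = reinterpret_cast<std::uintptr_t>(empties[i].get());
        std::cout << "delta: "
                  << static_cast<long long>(curr) - static_cast<long long>(prev)
                  << " bytes\n";
    }
    return 0;
}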
Note:
Does this compile for you? It doesn't for me, because empties can't be implicitly initialised.
Please notice that the printed pointers are virtual addresses; the OS lets them appear as consecutive addresses even though they could be mapped to entirely different physical pages.
Related
I am allocating a multidimensional valarray of size 2000x2000 and it is working smoothly.
valarray<valarray<int>> D(valarray<int>(-2,2000), 2000);
D[1999][1999] = 2000;
However, if I try to allocate a normal array and access an element, I get a segmentation fault.
int A[2000][2000];
A[1999][1999] = 2000;
Both are on the stack, so why the difference?
Like std::vector, the underlying storage of std::valarray is dynamic, and the size of the object that manages this storage does not depend on the number of elements.
This program:
#include <iostream>
#include <valarray>

int main() {
    std::cout << "sizeof(std::valarray<std::valarray<int>>): "
              << sizeof(std::valarray<std::valarray<int>>) << std::endl;
    std::cout << "sizeof(int[2000][2000]): " << sizeof(int[2000][2000]) << std::endl;
}
produces this output for me:
sizeof(std::valarray<std::valarray<int>>): 16
sizeof(int[2000][2000]): 16000000
If you were to use std::array instead, you would have problems, though.
Both are on the stack, so why the difference?
Because std::valarray is a much, much smaller object. Its size is entirely implementation-defined, but in one particular standard library implementation I looked at, it was the size of two pointers.
By contrast, the size of the 2D array A is more than 15 megabytes, assuming a 4-byte int. The space available for automatic objects (shared among all of them) is usually far smaller than that in typical language implementations.
The dynamic memory allocation is hidden inside the constructor of the valarray class and still uses new or malloc in the end.
Actually, the valarray's element storage is not on the stack. That is why constructing the plain array overflows the stack while the valarray doesn't.
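If you need the 2000x2000 array itself, the usual workaround is to put the element storage on the heap rather than the stack. A minimal sketch (my own, not from the answers above) using std::vector:

#include <vector>

int main() {
    // 2000 rows of 2000 ints each; all element storage lives on the heap.
    std::vector<std::vector<int>> A(2000, std::vector<int>(2000, -2));
    A[1999][1999] = 2000;   // fine: only the small vector objects live on the stack
}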
I am learning C++, and one lesson contained code about exception handling. The code is not mine; it is just the example for "try and catch", so this question is not about the code quality.
My question about this code is: is the output and the calculation of the memory size correct?
When I allocate a block of memory with new double[1000], isn't its size then 8000 bytes?
The cerr output only counts it as 1 kB instead of 8 kB. Am I wrong?
I got the size of 1 double with sizeof(double) to confirm it is 8 bytes.
#include <iostream>
#include <cstdlib>
#include <new>

using namespace ::std;

int main()
{
    int i = 0;
    double *q;
    try
    {
        while (1)
        {
            q = new double[1000];
            i++;
        }
    }
    catch (bad_alloc &ex)
    {
        cerr << "The memory is used up. " << i
             << " Kilobyte were available." << endl;
        exit(1);
    }
}
To summarize what #Peter said in his comment: Your i variable is counting the number of allocations, not the total amount of memory allocated.
Note, however, that even if you "fix" this, what you'll get is not the amount of "available memory", nor even the amount of available memory rounded down to a multiple of 8000. This is because "available memory" is not a very well-defined concept. The operating system may be willing to let you allocate a zillion bytes; but it might not actually do anything visible to other processes until you start writing into that memory. And even if you do write to it - it could swap unused memory pages to the hard disk / SSD, to make room for the pages you're working on.
If you wanted to check what the maximum amount of memory you can allocate using new, you might consider using a binary-search-like procedure to obtain the size; I won't spell it out in case that's your homework. (And of course, this too will not be accurate since other processes' memory use fluctuates.)
Also consider reading: How to get available memory C++/g++?
Finally, some nitpicks:
You're using inconsistent indentation. That's confusing.
i is not such a good name for a variable. num_allocations would fit better. When you use a more meaningful name you also commit to its semantics, which makes it more difficult to get them mixed up.
Try to avoid "magic numbers" like 1000. Define constants using enum or constexpr. For example: enum { Kilo = 1000 };.
There doesn't seem to be a good reason to use double in such a program - which has nothing to do with floating-point arithmetic.
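Putting those suggestions together, a sketch of how the loop might look (the names and the byte-counting are my own choices, not from the lesson; double is kept only for comparability with the original):

#include <cstddef>
#include <iostream>
#include <new>

int main()
{
    constexpr std::size_t Kilo = 1000;          // bytes per kilobyte, as used in the lesson
    constexpr std::size_t BlockElements = 1000; // doubles per allocation

    std::size_t num_allocations = 0;
    try
    {
        while (true)
        {
            new double[BlockElements];          // deliberately leaked, as in the original
            ++num_allocations;
        }
    }
    catch (const std::bad_alloc&)
    {
        std::cerr << "The memory is used up. "
                  << num_allocations * BlockElements * sizeof(double) / Kilo
                  << " kilobytes were allocated before bad_alloc." << std::endl;
    }
}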
You are absolutely correct. It should be:
cerr << "The memory is used up. " << sizeof(double) * i
<< " Kilobyte were available." << endl;
I've lately been experimenting with dynamically allocated arrays. I came to the conclusion that they have to store their own size in order to free the memory.
So I dug around in memory with pointers and found that the 6*4 bytes directly before and the 1*4 bytes directly after the array don't change upon recompilation (i.e. they aren't random garbage).
I represented these as unsigned int values and printed them out in the Windows console.
Here's what I got:
(the array's content lies between the fdfdfdfd uints in the representation)
So I figured out that the third unsigned int directly before the array's first element is the size of the allocated memory in bytes.
However, I cannot find any information about the rest of them.
Q: Does anyone know what the memory surrounding the array's content means and care to share?
The code used in the program:
#include <iostream>
#include <cstdlib>   // for system()

// Deliberately reads memory just before and after the allocated block
// (undefined behaviour, done here purely to inspect the heap layout).
void show(unsigned long val[], int n)
{
    using namespace std;
    cout << "Array length: " << n << endl;
    cout << "hex: ";
    for (int i = -6; i < n + 1; i++)
    {
        cout << hex << (*(val + i)) << "|";
    }
    cout << endl << "dec: ";
    for (int i = -6; i < n + 1; i++)
    {
        cout << dec << (*(val + i)) << "|";
    }
    cout << endl;
}

int main()
{
    using namespace std;
    unsigned long *a = new unsigned long[15]{ 0,1,2,3,4,5,6,7,8,9,10,11,12,13,14 };
    unsigned long *b = new unsigned long[15]{ 0 };
    unsigned long *c = new unsigned long[17]{ 0 };
    show(a, 15);
    cout << endl;
    show(b, 15);
    cout << endl;
    show(c, 17);
    cout << endl;
    cout << endl;
    system("PAUSE");
    delete[] a;
    delete[] b;
    delete[] c;
}
It typically means that you carried out your experiments using a debugging configuration of the project and debugging version of the standard library. That version of the library uses some pre-defined bit-patterns to mark the boundaries of each allocated memory block ("no man's land" areas). Later, it checks if these bit-patterns survived intact (e.g. at the moment of delete[]). If they did not, it implies that someone wrote beyond the boundaries of the memory block. Debug version of the library will issue a diagnostic message about the problem.
If you compile your test program in release (optimized) configuration with release (optimized) version of the standard library, these "no man's land" areas will not be created, these bit-patterns will disappear from memory and the associated memory checks will disappear from the code.
Note also that the memory layout you observed is typically specific to arrays of objects with no destructors or with trivial destructors (which is basically the same thing). In your case you were working with plain unsigned long.
Once you start allocating arrays of objects with non-trivial destructors, you will observe that it is not just the size of memory block (in bytes) that's stored by the implementation, but the exact size of the array (in elements) is typically stored there as well.
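For illustration, here is a small sketch of that last point. It assumes an "array cookie" layout in which the element count sits in the size_t immediately before the first element whenever the element type has a non-trivial destructor; this is implementation-specific, and peeking at it like this is not portable:

#include <cstddef>
#include <iostream>

struct WithDtor {
    int x = 0;
    ~WithDtor() {}   // non-trivial destructor forces the implementation to remember the count
};

int main()
{
    WithDtor* arr = new WithDtor[7];

    // Implementation-specific peek at the "array cookie"; on common implementations
    // this prints 7, the element count delete[] needs in order to run the destructors.
    // Do not rely on this in real code.
    std::size_t* cookie = reinterpret_cast<std::size_t*>(arr) - 1;
    std::cout << "stored element count (maybe): " << *cookie << '\n';

    delete[] arr;
}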
"I got to conclusion that they have to store their own size in order to free the memory." No they don't.
An array does not free its memory. You never get an array from new/malloc; you get a pointer to memory in which you can store an array, and if you forget the size you requested, you cannot get it back. The standard library often depends on the OS under the hood as well.
And even the OS does not have to remember it. There are implementations with very simple memory management that basically return the current pointer to free memory and bump that pointer forward by the requested size; free does nothing, and the freed memory is simply forgotten.
Bottom line: memory management is implementation-defined, and outside of the block you get, nothing is guaranteed. The compiler or the OS can mess with the surrounding memory, so you need to look at documentation specific to your environment.
The bit patterns you are asking about are often used as safeguards or for debugging. E.g.: When and why will an OS initialise memory to 0xCD, 0xDD, etc. on malloc/free/new/delete?
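To illustrate the "bump the pointer" scheme mentioned above, here is a toy sketch (entirely my own, not a real allocator): allocation just advances an offset into a fixed arena, and deallocation is a no-op.

#include <cstddef>

// Toy bump allocator: hands out memory from a fixed arena and never reclaims it.
class BumpAllocator {
public:
    void* allocate(std::size_t bytes) {
        // round up to keep allocations reasonably aligned
        bytes = (bytes + alignof(std::max_align_t) - 1) & ~(alignof(std::max_align_t) - 1);
        if (offset_ + bytes > sizeof(arena_))
            return nullptr;                  // "out of memory"
        void* p = arena_ + offset_;
        offset_ += bytes;                    // just move the pointer forward
        return p;
    }
    void deallocate(void*) {
        // does nothing: freed memory is simply forgotten
    }
private:
    alignas(std::max_align_t) unsigned char arena_[1 << 16];
    std::size_t offset_ = 0;
};

int main() {
    BumpAllocator a;
    void* p = a.allocate(64);   // first block
    void* q = a.allocate(64);   // placed right after the first one
    a.deallocate(p);            // no-op; the space is never reused
    (void)q;
}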
I am compiling the following code with VS2012, 32-bit. I know that max_size() returns the
"maximum potential size the container can reach", which in my case is 1,073,741,823 (yay).
So how can I know how many objects my container can really store? (I have 64 GB of RAM.)
unsigned int operationUnit = 100000000;
vector<int*> data;
std::cout << "max cap: " << data.max_size() << std::endl;
for (unsigned int i = 0; i < operationUnit; i++)
    data.push_back(new int());
This will end up in a bad_alloc. However, as I am targeting x64, this problem doesn't occur, since the max cap is much higher; but I still cannot figure out the exact number of elements, which I would need in order to clamp user input.
thanks!
Well, it is OS-dependent of course, but the results would be similar for everyone. For example, when run as a 32-bit executable, a build with VS2012 will consistently stop at 26,906,977 elements in a vector of int*, without posing any real threat to your memory (not even close).
Now it gets interesting when you build a 64-bit version, in which case a bad_alloc is thrown only when (almost) all your memory is drained. In that case, neither C++ nor any other language can protect you.
In the screenshot that follows I'm posting an example of this happening: by the time bad_alloc gets thrown, the program is in no position to catch it or do anything useful with it. The OS steps in, the process is killed, and its memory is deallocated at once (see graph). In the corresponding 32-bit version the exception was caught normally and deallocation took about 10 minutes.
Now, this is a very simplistic way of looking at it, and I'm sure OS gurus could supply more insight, but feel free to try this at home (and burn through some memory; I can't stop thinking that I can smell something burnt after this).
The code in the screenshot:
#include <iostream>
#include <vector>
#include <exception>

using namespace std;

int main()
{
    vector<int*> maxV;
    try
    {
        while (true) maxV.push_back(new int);
    }
    catch (bad_alloc &e)
    {
        cout << "I caught bad alloc at element no " << maxV.size() << "\n";
        for (auto i : maxV) delete i;
    }
    catch (exception &e)
    {
        cout << "Some other exception happened at element no " << maxV.size() << "\n";
        for (auto i : maxV) delete i;
    }
    return 0;
}
You can't. The OS could simply run out of memory. You may find that a deque can grow larger before failing than a vector for huge amounts of data, as its storage is not contiguous and so it is less affected by memory fragmentation, which can be significant once you end up allocating more than half of your entire memory.
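To see the difference that answer describes, here is a sketch (my own) that runs the same exhaustion loop with std::deque instead of std::vector; on a 32-bit build the deque will typically get further before bad_alloc, because it never needs one huge contiguous block for its bookkeeping:

#include <deque>
#include <iostream>
#include <new>

int main()
{
    std::deque<int*> d;
    try
    {
        while (true) d.push_back(new int);   // exhaust memory, as in the vector example
    }
    catch (const std::bad_alloc&)
    {
        std::cout << "deque reached " << d.size() << " elements before bad_alloc\n";
        for (auto p : d) delete p;
    }
}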
In this example, I create a vector with one integer in it and then I erase that integer from the vector. The size of the vector decreases, but the integer is still there! Why is the integer still there? How is it possible for a vector of size 0 to contain elements?
#include <vector>
#include <iostream>

using namespace std;

int main(int argc, char* argv[])
{
    vector<int> v;
    v.push_back(450);

    cout << "Before" << endl;
    cout << "Size: " << v.size() << endl;
    cout << "First element: " << (*v.begin()) << endl;

    v.erase(v.begin());

    cout << "After" << endl;
    cout << "Size: " << v.size() << endl;
    cout << "First element: " << *(v.begin()) << endl;

    return(0);
}
output:
Before
Size: 1
First element: 450
After
Size: 0
First element: 450
You are invoking undefined behavior by dereferencing an invalid memory location. Normally, the heap manager will not immediately release memory freed with delete, for efficiency reasons. However, that doesn't mean you may access that memory location; the heap manager can reuse it for other purposes whenever it likes. So your program will behave unpredictably if you dereference an invalid memory location.
IIRC a vector doesn't release space unless specifically told to, so you're seeing an item which is still in its memory but no longer being tracked by the vector. This is part of the reason why you're supposed to check the size first (the other being that if you never assigned anything, you'd be dereferencing a garbage pointer).
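In other words, something along these lines (a minimal sketch of the safe pattern, not from the original answer):

#include <iostream>
#include <vector>

int main()
{
    std::vector<int> v;
    v.push_back(450);
    v.erase(v.begin());

    // Only dereference begin() when the vector actually has elements.
    if (!v.empty())
        std::cout << "First element: " << *v.begin() << '\n';
    else
        std::cout << "Vector is empty, nothing to print\n";
}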
To start, don't count on it being this way across all systems. How a vector works internally is completely implementation-dependent. By dereferencing an invalid memory location, you're circumventing the behavior that has been outlined in the documentation.
That is to say, you can only count on the behavior that is outlined in the standard library documentation.
The reason you can still access that memory location is that this particular implementation doesn't immediately release the memory, but keeps it around for a while (probably for performance purposes). Another implementation could very well release that memory immediately if the author so desired.
It is just that the vector has not freed the memory, but kept it around for future use.
This is what we call "undefined behaviour". There is no guarantee that it will work next time, and it may easily crash the program on a future attempt. Don't do it.
What are your compiler options? I get a crash with the usual options, with both of the compilers I regularly use (g++ and VC++). In the case of g++, you have to set some additional options (-D_GLIBCXX_DEBUG, I think) for this behavior; as far as I can tell, it's the default for VC++. (My command for VC++ was just "cl /EHs bounds.cc".)
As others have said, it's undefined behavior, but with a good compiler, it will be defined to cause the program to crash.
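For reference, the corresponding g++ invocation would be something along these lines (the file name is just an example):

g++ -D_GLIBCXX_DEBUG bounds.cc -o bounds

With libstdc++'s debug mode enabled this way, out-of-range iterator use in containers such as std::vector is checked at run time and aborts the program instead of silently reading stale memory.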