Is there a way to find out where my operating system is located in my RAM? - c++

I wondered if C or C++ has a way to find where the operating system resides in RAM and free that region. I know that I can use free() to release memory. I wonder if I could shut down my computer by freeing my operating system's RAM.

Before protected memory was a thing, you could just access any bit of memory using its physical address and manipulate it. This is how DOS and DOS-based Windows (pre-Windows 95, like Windows 3.1) worked.
Protected memory, or virtualized memory, means you can do things like swap parts of memory out to disk, in effect pretending to have more memory than the computer physically has. Chunks of memory can be swapped around as necessary, paged in and paged out, with the running program being none the wiser. These addresses are all virtual, or "fake", in that they don't physically exist, but as far as the CPU is concerned they are real and work exactly as you'd expect, something accomplished by the Memory Management Unit (MMU) integrated into the CPU.
With protected memory, your "user space" program no longer sees physical memory addresses, but instead virtual addresses that the operating system itself manages. On Intel-type systems the kernel, the core of the operating system, runs within a special protection ring that prevents user programs from directly accessing or manipulating its memory.
Any multi-user system must implement this kind of memory and kernel protection or there would be no way to prevent one user from accessing the memory of another user's processes.
Within the kernel there is no "malloc" or "free" in the conventional sense; the kernel has its own special allocation mechanisms. These are completely separate from the traditional malloc() and free() functions in the C standard library and are not interchangeable in any way. Each kernel, be it Linux or BSD or Windows or otherwise, does this in a different way, even if they can all support user-space code that uses the exact same malloc() function.
There should be no way that you can, through simple memory allocation calls, crash the system. If you can, congratulations, you've found an exploit and should document it and forward it to the appropriate parties for further analysis. Keep in mind this kind of thing is heavily researched so the likelihood of you discovering one by chance is very low. Competitions like pwn2own show just how much work is involved in bypassing all this security.
It's also important to remember that the operating system does not necessarily live in a fixed location. Address Space Layout Randomization (ASLR) is a technique that scrambles the addresses of various functions and data to ensure that an exploit can't use hard-coded values. Before this was common you could predict where various things would live in memory and do blind manipulation through a tiny bug, but that's much harder now: you must not only find a bug to exploit, but also another to discover the addresses in the first place.
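A quick way to observe ASLR, assuming a typical Linux system where the compiler produces a position-independent executable by default, is to print the address of something in the binary and run the program twice; the address will usually differ between runs:

#include <cstdio>

int main() {
    static int marker = 0; // lives in the executable's data segment
    // With ASLR and a position-independent executable, this address
    // typically changes from one run to the next.
    std::printf("marker is at %p\n", static_cast<void*>(&marker));
}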
All that being said, there's nothing special about C or C++ in terms of "power" that makes it able to do things no other language can do. Any program that is able to bind against the operating system functions has the same equivalent "power" in terms of control. This includes Python, Perl, Ruby, Node.js, C# and a long, long list of others that can bind to C libraries and make arbitrary function calls.
People prototype "exploits" in whatever language is most convenient, and that's as often Perl or Python as it is C. It really depends on what you're trying to accomplish. Some bugs, once discovered, are so easy to reproduce you could do it with something as mundane as browser JavaScript, as was the case with Rowhammer.

You mention free() as a means to free memory, which is correct but too simplified. Its counterparts malloc() and calloc() ultimately rely on system calls that request chunks of memory from the operating system, although the allocator typically manages a pool in user space rather than making a system call for every request. When you call free(), you relinquish ownership of that memory and return it to the allocator, which may or may not hand it back to the operating system.
Your C/C++ program runs in a virtual address space which the operating system's memory management subsystem maps to actual RAM addresses. No matter what address you access, it can never be outside this virtual address space, which is entirely under the control of the operating system.
On modern operating systems, a user application can never directly access the operating system's memory. All memory it uses is granted to it by the operating system. The OS acts as a bridge/abstraction between your user applications and the hardware; that's its whole purpose: to prevent direct interaction with the hardware, in your case, the RAM.
Once upon a time, before the advent of virtual memory, RAM was directly accessible. It was exactly this vulnerability, along with the need to run programs larger than physical memory, that led to the introduction of virtual memory.
The only way you can mess with the operating system from user space is to make system calls with malicious arguments.
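As a minimal illustration of that protection (a sketch, assuming a typical desktop OS), writing to an arbitrary address doesn't touch the OS or any other process; the MMU faults because nothing is mapped there, and the OS simply terminates the offending program:

int main() {
    // An arbitrary virtual address that is almost certainly not mapped
    // into this process's address space.
    volatile int* p = reinterpret_cast<volatile int*>(0x10);
    *p = 42; // page fault; the OS kills the process (e.g. SIGSEGV on Linux)
}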

Related

Allocating Memory to a Program Upon Initialization in C++?

I would like to allocate a set amount of memory for the program upon initialization so that other programs cannot steal memory from it. Essentially, I would like to create a heap for my program (without having to program a heap module all by myself).
If this is not possible, can you please refer me to a heap module that I can import into my project?
Using C++17.
Edit: More specifically, I am trying to specify, for example, that the program is only allowed to malloc 4 MB of data. If it tries to allocate any more, it should throw an error.
What you ask is not possible with the features provided by ISO C++.
However, on most common platforms, reserving physical RAM is possible using platform-specific extensions. For example, Linux provides the function mlock and Microsoft Windows provides the function VirtualLock. But, in order to use these functions, you must either
know which memory pages the default allocator is using for memory allocation, which can get messy and complicated, or
use your own implementation of a memory allocator, so that it can itself call mlock/VirtualLock whenever it receives memory from the operating system.
Your own implementation of a memory allocator could be as simple as forwarding all memory allocation requests to the operating system's kernel, for example using mmap on Linux or VirtualAlloc on Windows. However, this has the disadvantage that the granularity of all memory allocation requests is the size of a memory page, which on most systems is at least 4096 bytes. This means that even a very small memory allocation request of a few bytes will actually take up 4096 bytes of memory, which would be a big waste. Also, in your question, you stated that you wanted to preallocate a certain amount of memory when you start your application, so that you can use that memory later to satisfy smaller memory allocation requests. This cannot be done using the method described above.
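A minimal sketch of that forwarding approach on Linux might look like the following (the names locked_alloc/locked_free are made up for illustration; note the page granularity and lack of suballocation described above, and that mlock can fail if RLIMIT_MEMLOCK is exceeded):

#include <sys/mman.h>
#include <cstddef>

// Allocate n bytes directly from the kernel and pin them in RAM.
// Every request costs at least one page (typically 4096 bytes).
void* locked_alloc(std::size_t n) {
    void* p = mmap(nullptr, n, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) return nullptr;
    if (mlock(p, n) != 0) { munmap(p, n); return nullptr; } // pinning refused
    return p;
}

void locked_free(void* p, std::size_t n) {
    munlock(p, n);
    munmap(p, n);
}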
Therefore, you may want to consider using a "proper" memory allocator implementation, which is able to satisfy several smaller allocation requests using a single memory page. See this list on Wikipedia for a list of common implementations.
That said, what you describe may be an XY problem, depending on what operating system you are using. For example, in contrast to Windows, Linux will typically overcommit memory. This means that the Linux kernel will allow applications to allocate more memory than is actually available, on the assumption that most applications will not use all the memory they request. Therefore, a call to std::malloc or new will seldom fail on Linux (but it is still possible, depending on the configuration). Instead, under low memory conditions, the Linux OOM killer (out of memory killer) will start killing processes that are taking up large amounts of memory, in order to free up memory and to keep the system running.
For this reason, the methods described above are likely to work on Microsoft Windows, but on Linux, they could be counterproductive, as they would make your process more likely to fall prey to the OOM killer.
However, even if you are able to accomplish what you want using the methods described above, I generally don't recommend that you use these methods, as this behavior is unfair towards the other processes in the system. Generally, you should leave the task of deciding which process gets (fast) physical memory and which process gets (slow) swap space to the operating system, as the operating system can do a better job of fairly distributing its resources among its processes.
If you want to force actual allocation of memory pages to your process, there's no way around managing your own memory.
In C++, the canonical way to do this would be to write an implementation of operator new() and operator delete() (the global ones!) which are responsible for performing the actual memory allocation. The function signatures are:
void* operator new(std::size_t size);
void operator delete(void* pointer) noexcept;
and you'll need to #include the <new> header.
Your implementation can do its work via one of three possible routes:
It allocates the memory using the C function malloc(), and immediately touches each memory page by writing a value to it. This forces the system kernel to actually back the memory region with real memory.
It allocates the memory using malloc(), and proceeds to call mlockall(). This is the nuclear option for when you absolutely must avoid all paging, including paging of code segments and shared libraries.
It asks the kernel directly for some chunks of memory using mmap() and proceeds to lock them into RAM via mlock(). The effect is similar to the previous option, but it is targeted only at the memory you allocated for your operator new() implementation.
The first method works independently of the OS kernel; the other two assume a POSIX system such as Linux.
With GCC, you can perform the memory allocation before main() is called by using the __attribute__((constructor)) function attribute.
Writing such a memory allocator is not rocket science. It's not even a lot of code if done right. I once wrote an operator new()/operator delete() implementation that fits into 170 lines, including all my special features, comments, empty lines, and the license declaration. It's really not that hard.
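To give an idea, here is a deliberately naive sketch of the first route (assuming 4 KiB pages; a production version would also provide the nothrow and sized overloads and respect alignment):

#include <new>
#include <cstdlib>
#include <cstddef>

void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc{};
    // Touch one byte per page so the kernel backs the whole
    // region with physical memory right away.
    volatile char* c = static_cast<char*>(p);
    for (std::size_t i = 0; i < size; i += 4096) c[i] = 0;
    return p;
}

void operator delete(void* pointer) noexcept {
    std::free(pointer);
}

int main() {
    int* p = new int[1 << 20]; // 4 MiB, every page touched immediately
    delete[] p;
}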
I would like to allocate a set amount of memory for the program upon initialization so that other programs cannot steal memory from it.
Why would you want to do that?
It is not your business to decide whether your program is more important than the others!
Imagine your program running in parallel with some printing utility driving the printer. This is a common occurrence: I have downloaded some long PDF document (e.g. several hundred pages, like the C++ standard n3337), and I want to print it on paper to study it in a train, an airplane, at home and annotate it with a pencil and paper. The printing is likely to last more than an hour, and require computing resources (e.g. on Linux some CUPS printer driver converting PDF to PCL). During the printing, I could use your program.
If I am a user of your program, you have decided (in my place) that printing that document is less important to me than using your program (while the printer is slowly spitting out pages).
Leave the allocation and management of memory to the operating system of your user.
There are of course important exceptions to that common-sense rule. A typical medical robot used in neurosurgery has embedded software with constraints different from those of web server software. See also this draft report. For Linux, read Advanced Linux Programming, then syscalls(2).
More specifically, I am trying to specify, for example, that the program is only allowed to malloc 4 MB of data.
This is really simple. Some OSes provide the ability to limit resources (on Linux, see setrlimit(2)...). Write your own malloc routine on top of operating-system-specific primitives such as (on Linux) mmap(2). See also this, this and that answers (all focused on Linux; adapt them to your particular operating system). You can probably find open-source implementations of malloc (on GitHub or GitLab) for your particular operating system. For Linux, look here, then study the source code of glibc or musl-libc. In C++, study the source code of GCC or Clang (::operator new is probably implemented using malloc).
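As a hedged sketch of the setrlimit approach on Linux (RLIMIT_DATA caps the data segment that malloc draws from; since Linux 4.7 it also covers most mmap-based allocations, so the exact behavior depends on your kernel and libc):

#include <sys/resource.h>
#include <cstdlib>
#include <cstdio>

int main() {
    // Limit this process's data segment to 4 MiB.
    struct rlimit lim{4 * 1024 * 1024, 4 * 1024 * 1024};
    if (setrlimit(RLIMIT_DATA, &lim) != 0) { perror("setrlimit"); return 1; }

    void* ok  = std::malloc(1 * 1024 * 1024);  // likely succeeds
    void* big = std::malloc(64 * 1024 * 1024); // exceeds the limit: returns NULL
    std::printf("1 MiB: %p, 64 MiB: %p\n", ok, big);
    std::free(ok);
}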

What decides where on the heap memory is allocated?

Let me clear up: I understand how new and delete (and delete[]) work. I understand what the stack is, and I understand when to allocate memory on the stack and on the heap.
What I don't understand, however, is: where on the heap is memory allocated. I know we're supposed to look at the heap as this big pool of pretty much limitless RAM, but surely that's not the case.
What is in control of choosing where on the heap memory is stored and how does it choose that?
Also: the term "returning memory to the OS" is one I come across quite often. Does this mean that the heap is shared between all processes?
The reason I care about all this is because I want to learn more about memory fragmentation. I figured it'd be a good idea to know how the heap works before I learn how to deal with memory fragmentation, because I don't have enough experience with memory allocation, nor C++ to dive straight into that.
The memory is managed by the OS, so the answer depends on the OS/platform that is used. The C++ specification does not specify how memory is allocated or freed at a lower level; it specifies things in terms of object lifetime.
While multi-user desktop/server/phone OSes (like Windows, Linux, macOS, Android, …) manage memory in similar ways, it could be completely different on embedded systems.
What is in control of choosing where on the heap memory is stored and how does it choose that?
It's the OS that is responsible for that. How exactly depends, as already said, on the OS. The OS could also be a thin layer, in the form of a runtime library combined with a minimal OS like IncludeOS.
Does this mean that the heap is shared between all processes?
Depends on the point of view. The address space is, for multiuser systems, in general not shared between processes. The OS ensures, through virtual address spaces, that one process cannot access the memory of another process. But the OS does distribute the whole of RAM among all processes.
For embedded systems, it could even be the case that each process has a fixed amount of preallocated memory, not shared between processes, with no way to allocate new memory or free it. It is then up to the developer to manage that preallocated memory themselves, by providing custom allocators to the stdlib's objects and constructing objects in that preallocated storage (see the sketch below).
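C++17's polymorphic allocators make that pattern reasonably convenient. Here is a minimal sketch (the pool size is arbitrary) in which a standard container draws exclusively from a preallocated buffer and never asks the OS for more memory:

#include <memory_resource>
#include <vector>
#include <cstddef>

int main() {
    static std::byte pool[64 * 1024]; // preallocated at startup
    // null_memory_resource() as upstream: if the pool runs out,
    // allocation throws instead of falling back to the OS.
    std::pmr::monotonic_buffer_resource arena(
        pool, sizeof(pool), std::pmr::null_memory_resource());
    std::pmr::vector<int> v(&arena); // stdlib object using the custom allocator
    for (int i = 0; i < 1000; ++i) v.push_back(i);
}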
I want to learn more about memory fragmentation
There are two kinds of fragmentation: fragmentation of the addresses the OS exposes to the C++ runtime, and fragmentation on the hardware/OS side (which could be the same thing on an embedded system). How and in what form the memory is fragmented or organized by the OS cannot be determined using the functions provided by the stdlib. And how fragmentation of the process's address space behaves depends, again, on the OS and also on the stdlib implementation used.
None of these details are specified in the C++ standard. Each C++ implementation is free to implement them in whichever way works for it, as long as the end result conforms to the standard.
Each C++ compiler and operating system implements these low-level details in its own unique way. There is no specific answer to these questions that applies to every C++ compiler and every operating system. Over time, a lot of research has gone into profiling and optimizing memory allocation and deallocation algorithms for a typical C++ application, and some tailored C++ implementations offer alternative memory allocation algorithms from which an application can choose whichever it thinks will work best for it. Of course, none of this is covered by the C++ standard.
Of course, all memory in your computer must be shared between all processes that are running on it, and your operating system is responsible for divvying it up and parceling it out to all the processes when they request more memory. All that "returning memory to the OS" means is that a process's memory allocator has determined that it no longer needs a sufficiently large contiguous memory range, and notifies the operating system that the range is no longer used and can be reassigned to another process.
What decides where on the heap memory is allocated?
From the perspective of a C++ programmer: It is decided by the implementation (of the C++ language).
From the perspective of a C++ standard library implementer (as an example of what may hypothetically be true for some implementation): It is decided by malloc which is part of the C standard library.
From the perspective of malloc implementer (as an example of what may hypothetically be true for some implementation): The location of heap in general is decided by the operating system (for example, on Linux systems it might be whatever address is returned by sbrk). The location of any individual allocation is up to the implementer to decide as long as they stay within the limitations established by the operating system and the specification of the language.
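As a toy illustration of that last perspective, here is a hypothetical bump allocator built directly on sbrk (an obsolete interface, but still available on Linux); the "location" of each allocation is simply wherever the current program break happens to be:

#include <unistd.h>
#include <cstdio>
#include <cstddef>
#include <cstdint>

void* bump_alloc(std::size_t n) {
    void* p = sbrk(static_cast<std::intptr_t>(n)); // extend the program break by n bytes
    return p == reinterpret_cast<void*>(-1) ? nullptr : p;
}

int main() {
    void* a = bump_alloc(16);
    void* b = bump_alloc(16);
    std::printf("%p\n%p\n", a, b); // b lies exactly 16 bytes past a, at the top of the heap
}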
Note that heap memory is called "free store" in C++. I think this is to avoid confusion with the heap data structure which is unrelated.
I understand what the stack is
Note that there is no such thing as "stack memory" in the C++ language. The fact that C++ implementations store automatic variables in such manner is an implementation detail.
The physical RAM is indeed shared between all processes, but in C++ the delete operator generally does not return the memory to the operating system; the allocator keeps it to reuse later on. Where an allocation ends up depends on how much memory you request (there has to be a large enough free block) and on the allocator's placement strategy, which can be first fit, best fit, or worst fit (read more on that topic on Google). The name RAM (random-access memory) basically tells you where to search for your memory :D
It is, however, possible to get the same memory location when you have a small program and restart it multiple times.

What would be the disadvantage of creating an array of really big size on 64 bit systems?

Operating systems like Linux work on the principle of copy-on-write, so even if you allocate an array of, say, 100 GB but only use up to 10 GB, you would only be using 10 GB of memory. So, what would be the disadvantage of creating such a big array? I can see an advantage, though, which is that you won't have to worry about using a dynamic array, which would have the cost of reallocations.
The main disadvantage is that by doing this you are making a strong assumption about how exactly the standard library allocators1 and the underlying Linux allocators work. In fact, the allocators and underlying system do not always work as you mention.
Now, you mentioned "copy on write", but what you are likely really referring to is the combination of lazy page population and overcommit. Depending on the configuration, it means that any memory you allocate but don't touch may not count against memory limits and may not occupy physical memory.
The problem is that this often may not work. For example:
Many allocators have modes where they touch the allocated memory, e.g., in debug mode to fill it out with a known pattern to help diagnose references to uninitialized memory. Most allocators touch at least a few bytes before your allocated region to store metadata that can be used on deallocation. So you are making a strong assumption about allocator behavior that is likely to break.
The Linux overcommit behavior is totally configurable. In practice many server-side Linux users will disable it in order to reduce uncertainty and unrecoverable problems related to the OOM killer. So the claim that Linux behaves lazily is only true for some overcommit configurations and false for others.
You might assume that memory is being committed in 4K chunks and adjust your algorithm around that. However, systems have different page sizes: 16K and 64K are not uncommon as base page sizes, and x86 Linux systems have transparent huge pages enabled by default, so you may actually be getting 2,048K (2 MB) pages without realizing it! In this case you may end up committing nearly the entire array, depending on your access pattern.
As mentioned in the comments, the "failure mode" for this type of use is pretty poor. You think you'll only use a small portion of the array, but if you do end up using more than the system can handle, at best you may get a signal delivered to your application on some random access to a new page, but more likely the OOM killer will simply kill some other random process on your machine.
1 Here I'm assuming you are using something like malloc or new to allocate the array, since you didn't mention mmapping it directly or anything.
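The lazy behavior itself is easy to observe on a Linux box with overcommit enabled (the default heuristic mode; whether the 100 GB request is even granted depends on that configuration), as in this sketch:

#include <cstdlib>
#include <cstring>
#include <cstdio>
#include <cstddef>

int main() {
    const std::size_t total = 100ull << 30; // ask for 100 GiB of virtual address space
    const std::size_t used  = 1ull << 30;   // but touch only 1 GiB of it
    char* p = static_cast<char*>(std::malloc(total));
    if (!p) { std::puts("allocation refused (overcommit disabled?)"); return 1; }
    std::memset(p, 1, used); // faults in roughly 1 GiB of physical pages
    std::puts("touched 1 GiB of a 100 GiB allocation; compare VSZ vs RSS in ps/top");
    std::free(p);
}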
Real-world operating systems don't simply allow your program to access all memory available - they enforce quotas. So a 64-bit operating system, running on hardware with enough physical memory, will simply refuse to allocate all that memory to any program. This is even more true if your operating system is virtualised (e.g. some hypervisor hosts two or more operating systems on the same physical platform - the hypervisor enforces quotas for each hosted operating system, and one of them will enforce quotas for your program).
Attempting to allocate a large amount of memory is therefore, practically, an effective way to maximise the likelihood that the operating system will not allow your program the memory it needs.
While, yes, it is possible for an administrator to increase quotas, that has consequences as well. If you don't have administrative access, you need to convince an administrator to increase those quotas (which isn't necessarily easy unless your machine only has one user). A program that consumes a large amount of memory can cause other programs to be starved of memory, which becomes a problem if those other programs are needed by you or other people. In extreme cases, your program can starve the operating system itself of resources, which causes it and all the programs it hosts to slow down, and compromises system stability. These sorts of concerns are why systems enforce quotas in the first place, often by default.
There are also problems that can arise because operating systems can be configured to overcommit. Loosely speaking, this means that when a program requests memory, the operating system tells the program the allocation has succeeded, even if the operating system hasn't allocated it. Subsequently, when the program USES that memory (typically, writes data to it), the operating system is suddenly required to ACTUALLY make the memory available. If the operating system cannot do this for any reason, that becomes a problem for the program (which believes it has access to memory, but the operating system prevents access). This typically results in some error condition affecting program execution (and often results in program termination). While the problems associated with overcommitting can affect any program, the odds are markedly increased when the program allocates large amounts of memory.

Does an OS lock the entire amount of ram inside your computer

I was wondering if, for example, Windows completely locks all the available RAM so that some really bored person with too much time on their hands cannot start deleting memory from another process (somehow).
The question originated from what happens when using the delete operator in C++ (is C++ telling the OS that it can now release the memory for overwriting, or is C++ telling the hardware to unlock the memory?)... and then spawned into me thinking of specific hardware created to interface with the RAM at the hardware level and start deleting memory chunks for the fun of it, i.e. a hacker, perhaps.
My thoughts were: The Windows memory management program is told memory is free to be written to again, right? But does that also mean that the memory address is still locked at a hardware level, so that the memory can only be taken control of by Windows rather than another OS? Or is it like the wild west down at the hardware level... if Windows isn't locking memory, anything else can use the part that is now free.
I guess the real question is, is there a hardware level lock on memory addresses that operating systems can trigger... so that the memory has locked itself down and cannot be re-assigned then?
I was wondering if Windows completely locks all the available RAM
Windows, like any other operating system, uses all the available RAM.
so that some really bored person with too much time on their hands cannot start deleting memory from another process (somehow).
Non sequitur. It does it because that's what it's supposed to do. It's an operating system, and it is supposed to control all the hardware resources.
The question originated from what was happening when you mark memory for deletion in C++.
I don't know what 'mark memory for deletion in C++' means, but if you refer to the delete operator, or the free() function, they in general do not release memory to the operating system.
My thoughts were: The Windows memory management program is told memory is free to be written to again, right?
Wrong, see above.
But does that also mean that the memory address is still locked at a hardware level, so that the memory can only be taken control of by Windows rather than another OS?
What other OS? Unless you're in a virtual environment, there is no other OS, and even if you are, the virtual environment hands control over all the designated RAM to the guest operating system.
Or is it like the wild west down at the hardware level... if Windows isn't locking memory, anything else can use the part that is now free.
Anything else such as?
I guess the real question is, is there a hardware level lock on memory addresses that operating systems can trigger?
In general yes, there are hardware definitions of what privilege level is required to access each memory segment. For example, the operating system's own memory is immune to application processes, and application processes are immune to each other: but this is all highly hardware-dependent.
Your question doesn't really make much sense.
The concept you're looking for is mapping, not locking.
The memory is just there. The OS does nothing special about that.
What it does is map chunks of it into individual processes. Each process can see only the memory that is mapped into its address space. Trying to access any other address just leads to an access violation (or segmentation fault on Unixes). There's just nothing at those addresses. Not "locked memory", just nothing.
And when the OS decides to (or when the process requests it), a page of memory can be unmapped from a given process again.
It's not locking though. The memory isn't "owned" by the process it is mapped to. And the same memory can be mapped into the address spaces of multiple processes at the same time. That's one way to exchange data between processes.
So the OS doesn't "lock" or control ownership of memory. It just controls whether a given chunk of memory is visible to any particular process.
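That data-exchange mechanism is visible in ordinary user code. As a minimal POSIX sketch (the name "/demo_shm" is arbitrary, error handling is omitted, and older glibc needs -lrt), two cooperating processes running this same mapping code would see the same physical pages:

#include <sys/mman.h>
#include <fcntl.h>
#include <unistd.h>

int main() {
    // Create (or open) a named shared-memory object and size it to one page.
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, 4096);
    // Map it into this process's address space. Another process calling
    // shm_open()/mmap() with the same name gets the very same physical
    // pages mapped at (possibly) a different virtual address.
    char* p = static_cast<char*>(
        mmap(nullptr, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0));
    p[0] = 'x'; // immediately visible to every process mapping "/demo_shm"
    munmap(p, 4096);
    close(fd);
    shm_unlink("/demo_shm");
}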
It is not as simple as that; also, Windows is not open source, so exactly what it does may not be published. However, all addresses in user-space code are virtual and MMU-protected: address X in one process does not refer to the same physical memory as address X in another, and one process cannot access the memory of another. An attempt to access memory outside of the address space of a process will cause an MMU exception.
I believe that when Windows starts a process, it has an initial heap allocation, from which dynamic memory allocation occurs. Deleting a dynamically allocated block simply returns it to the process's existing heap (not to the OS). If the current heap has insufficient memory, additional memory is requested from the OS to expand it.
Memory can be shared between processes in a controlled manner. In Windows this is done via a memory-mapped file, and it uses the same virtual-memory mechanisms that the swap file uses to emulate more memory than is physically available.
I think that rather than asking a question on SO for this, you'd do better to first do a little basic research; start at About Memory Management on MSDN, for example.
With respect to external hardware accessing the memory, it is possible to implement shared memory between processors (it is not that uncommon; see here, for example), but it is not a matter of "wild west": the mechanisms for doing so are implemented via the OS.
Even on conventional PC architectures, many devices access memory directly via DMA as a method of performing I/O without CPU overhead. Again this is controlled by the OS and not at all "wild west", but an errant device driver could bring down your system, which is why Microsoft has a testing and approval process for drivers.
No, there isn't.
RAM is managed by software; RAM can't lock itself.
You asked the wrong question. There is no such thing as a hardware lock on memory. The trick is virtual memory management which makes only controlled amounts of memory available to a process. The OS controls all available memory, that's its job, and processes only ever see the memory that the OS has given to them.

Need to manage a slab of 'theoretical' memory

I need to manage memory that's in a separate memory space. Another program has a large slab of contiguous memory that my code cannot directly access, and notifies my code of its size during initialization. The other program will ask my code to "allocate" X bytes from the slab, and will later notify my code to deallocate the allocated blocks. The allocations and deallocations will be more or less unpredictable, much like the use of regular malloc and free. It's up to my code to manage the dynamic memory for the other process. The intended use has to do with device memory on a GPU, but I would rather not make the question specific to that. I'd like an answer that's generic, maybe even enough that I could theoretically use it as a back-end for a network API for managing virtual memory on a remote machine.
The basic functionality I need is to be able initialize the manager with the slab size, perform the equivalent of malloc() and free(), and get a few stats like remaining available memory and maybe maximum allocatable size.
Obviously, I can't use malloc() and free() 'abstractly'. I also don't want the management of my actual process memory to interfere with the management of my abstract slab.
What would be the best way to go about this? And, more specifically, does the standard library or Boost have this kind of a facility?
Notes:
Allocation strategy is not extremely important at this point. I mean, maybe it is, but first I need to figure out how I'm going to go about this business.
I'm going to be using this for allocating between tens and several thousand buffers per second, with different sizes; some as small as several bytes, some as large as gigabytes. There will be no buildup of allocated space over a long period of time.
a back-end for a network API for managing virtual memory on a remote machine
A networked system should not be concerned about the memory management of a remote node. The remote node should be free to choose whether to serve the request by allocating memory on the fastest but scarcest memory or in a larger but slower storage or a hybrid of both depending on its own heuristics, other requests happening concurrently, prioritization/security policy set by the network admin, and so on.
Thinking in terms of malloc and free will limit what a remote server can do to serve the requests efficiently.
Note that, strictly speaking, malloc itself only manages "theoretical"/"virtual" memory. The OS kernel can page memory allocated with malloc in and out of swap space and assign or move its physical address in the MMU. The "address" returned by malloc is actually just a virtual address which doesn't necessarily correspond to an actual physical address.
Unlike a networked system, when managing hardware you usually do want to dictate very closely what the hardware should do at each step, because, presumably, the hardware is connected to a single system, so any contention is only local contention, and performance throughput usually trumps everything else. Managing a remote resource on a networked system and managing local hardware are similar in nature but quite different in practice.
If you're interested in designs for networked shared memory, you might want to check out in-memory databases (e.g. Redis) or networked file systems. A networked shared memory can usually afford to use user-friendly string keys to point to the memory slabs; hardware shared memory usually wants to keep things simple and fast by using plain numeric handles.
Both networked and hardware systems need to deal with splitting a single resource between multiple independent processes securely. In many popular operating systems, the code responsible for managing hardware allocation is the kernel; it's quite rare for a userspace program to be responsible for managing hardware on monolithic OSes. Maybe you should be writing a device driver? Many GPGPU devices now have an MMU.
Certainly, there's nothing preventing you from writing a gpu_malloc()/gpu_free(). It theoretically should be possible to mmap the remote system's address space into the process's virtual memory address space so that programs could just use the remote memory just as any other memory.
I'm going to be using this for allocating between tens and several thousand buffers per second, with different sizes; some as small as several bytes, some as large as gigabytes.
Generally, you don't want to have thousands of small allocations. System calls related to memory management can be expensive if they involve remote systems, so it might be necessary to have the client program allocate large slabs and do its own suballocation (malloc does this: a small malloc usually triggers one system call that allocates more memory than requested, and malloc then suballocates that slab).
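As a starting point for the bookkeeping itself, here is a hedged sketch of a first-fit manager that never touches the slab; it only hands out offsets into it (the class and method names are made up, and coalescing adjacent free blocks on deallocation is the obvious next step):

#include <cstdint>
#include <map>

// Manages a remote/abstract slab purely by bookkeeping: allocations are
// offsets into the slab, never pointers into our own address space.
class SlabManager {
    std::map<std::uint64_t, std::uint64_t> free_; // offset -> size of free block
public:
    explicit SlabManager(std::uint64_t slab_size) { free_[0] = slab_size; }

    // First-fit: returns an offset, or UINT64_MAX if no block is large enough.
    std::uint64_t allocate(std::uint64_t n) {
        for (auto it = free_.begin(); it != free_.end(); ++it) {
            if (it->second >= n) {
                std::uint64_t off = it->first, rest = it->second - n;
                free_.erase(it);
                if (rest) free_[off + n] = rest;
                return off;
            }
        }
        return UINT64_MAX;
    }

    void deallocate(std::uint64_t off, std::uint64_t n) {
        free_[off] = n; // real code would coalesce with neighboring free blocks here
    }
};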