How to allocate full memory pages

In C or C++, on Linux, I want to allocate heap memory in full pages of the system's memory page size.
(The purpose is that I want to increase the likelihood that harmful buffer overflows cause segmentation faults.)
When I allocate memory with C++ array new (pointer = new char[size]), where size is a multiple of sysconf(_SC_PAGESIZE), the (virtual) address of the allocated memory is usually still not a multiple of sysconf(_SC_PAGESIZE). This indicates that I got a subset of a larger chunk, which is confirmed by the fact that writing to pointer[size] and a little beyond (a forced buffer overflow) usually does not cause a segmentation fault.
My question is: can I somehow influence the memory allocation so that it gives me full memory pages?
The processor architecture I am interested in is x86_64 (aka amd64). The operating system is either the latest Ubuntu or stable CentOS Linux (7.3); the latter comes with kernel 3.10 and gcc-4.8.
I do not care if the solution is in C or C++, therefore I ask to leave the C tag in this question.

1) Just switching from pointer = new char[size] to pointer = aligned_alloc(sysconf(_SC_PAGESIZE), size) resulted in proper page alignment and (so far, with small test programs) consistent generation of segmentation faults when exceeding the allocated range. As @JohnBollinger pointed out in his first comment on the question, segmentation faults are not guaranteed by the method of allocation alone. This can be fixed with 2):
2) The Linux man page for mprotect contains a complete example of restricting access to memory pages. The example also installs a signal handler for SIGSEGV, which I'm not interested in; the default action (abort) is good enough for me. The example section from the man page follows. Note that applying mprotect to memory areas unrelated to mmap is a Linux-specific extension not covered by POSIX.
EXAMPLE
The program below allocates four pages of memory, makes the third
of these pages read-only, and then executes a loop that walks
upward through the allocated region modifying bytes.
An example of what we might see when running the program is the
following:
$ ./a.out
Start of region: 0x804c000
Got SIGSEGV at address: 0x804e000
Program source
#include <unistd.h>
#include <signal.h>
#include <stdio.h>
#include <malloc.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/mman.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
static char *buffer;
static void
handler(int sig, siginfo_t *si, void *unused)
{
printf("Got SIGSEGV at address: 0x%lx\n",
(long) si->si_addr);
exit(EXIT_FAILURE);
}
int
main(int argc, char *argv[])
{
char *p;
int pagesize;
struct sigaction sa;
sa.sa_flags = SA_SIGINFO;
sigemptyset(&sa.sa_mask);
sa.sa_sigaction = handler;
if (sigaction(SIGSEGV, &sa, NULL) == -1)
handle_error("sigaction");
pagesize = sysconf(_SC_PAGE_SIZE);
if (pagesize == -1)
handle_error("sysconf");
/* Allocate a buffer aligned on a page boundary;
initial protection is PROT_READ | PROT_WRITE */
buffer = memalign(pagesize, 4 * pagesize);
if (buffer == NULL)
handle_error("memalign");
printf("Start of region: 0x%lx\n", (long) buffer);
if (mprotect(buffer + pagesize * 2, pagesize,
PROT_READ) == -1)
handle_error("mprotect");
for (p = buffer ; ; )
*(p++) = 'a';
printf("Loop completed\n"); /* Should never happen */
exit(EXIT_SUCCESS);
}
Attribution of the preceding quote:
This page is part of release 4.04 of the Linux man-pages project.
A description of the project, information about reporting bugs, and
the latest version of this page, can be found at http://www.kernel.org/doc/man-pages/.
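To combine points 1) and 2), here is a minimal sketch (not taken from the man page) that places the buffer directly in front of an inaccessible guard page, so that the first byte written past the end should fault. It obtains the memory with mmap rather than memalign, so mprotect is only applied to mmap-obtained memory. The names guarded_alloc and guarded_free are hypothetical, and the sketch assumes size is a non-zero multiple of the page size.

```c
#define _DEFAULT_SOURCE   /* for MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: map `size` bytes (assumed to be a non-zero multiple
 * of the page size) followed by one PROT_NONE guard page.
 * Returns NULL on failure. */
void *guarded_alloc(size_t size)
{
    long pagesize = sysconf(_SC_PAGESIZE);
    if (pagesize <= 0 || size == 0 || size % (size_t)pagesize != 0)
        return NULL;

    char *base = mmap(NULL, size + (size_t)pagesize, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return NULL;

    /* Revoke all access to the trailing page: touching base[size] now faults. */
    if (mprotect(base + size, (size_t)pagesize, PROT_NONE) == -1) {
        munmap(base, size + (size_t)pagesize);
        return NULL;
    }
    return base;
}

/* Release a region obtained from guarded_alloc, including its guard page. */
int guarded_free(void *p, size_t size)
{
    return munmap(p, size + (size_t)sysconf(_SC_PAGESIZE));
}
```

A caller would use guarded_alloc(4 * pagesize) in place of new char[size] and guarded_free in place of delete[]. Overflows that skip over the whole guard page are still not caught.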

Related

32-bit malloc() return NULL when opening many threads?

I have a sample C++ program as below:
#include <windows.h>
#include <stdio.h>
int main(int argc, char* argv[])
{
void * pointerArr[20000];
int i = 0, j;
for (i = 0; i < 20000; i++) {
void * pointer = malloc(131125);
if (pointer == NULL) {
printf("i = %d, out of memory!\n", i);
getchar();
break;
}
pointerArr[i] = pointer;
}
for (j = 0; j < i; j++) {
free(pointerArr[j]);
}
getchar();
return 0;
}
When I run it as a 32-bit Debug build in Visual Studio, I get the following result:
The program can use nearly 2 GB of memory before running out of memory.
This is normal behavior.
However, when I add code to start a thread inside the for loop, as below:
#include <windows.h>
#include <stdio.h>
DWORD WINAPI thread_func(VOID* pInArgs)
{
Sleep(100000);
return 0;
}
int main(int argc, char* argv[])
{
void * pointerArr[20000];
int i = 0, j;
for (i = 0; i < 20000; i++) {
CreateThread(NULL, 0, thread_func, NULL, 0, NULL);
void * pointer = malloc(131125);
if (pointer == NULL) {
printf("i = %d, out of memory!\n", i);
getchar();
break;
}
pointerArr[i] = pointer;
}
for (j = 0; j < i; j++) {
free(pointerArr[j]);
}
getchar();
return 0;
}
The result is as below:
Memory usage is only around 200 MB, but malloc returns NULL.
Could anyone help explain why the program cannot use up to 2 GB of memory before running out?
Does creating many threads like this cause a memory leak?
In my real application, this error occurs when I create about 800 threads; RAM usage at the time of the "out of memory" is around 300 MB.

As noted in a comment by @macroland, the main thing happening here is that each thread is consuming 1 MiB for its stack (see MSDN CreateThread and Thread Stack Size). You say malloc returns NULL once the total you have directly allocated reaches 200 MB. Since you are allocating 131125 bytes at a time, that is 200 MB / 131125 B = 1525 threads. Their cumulative stack space will be around 1.5 GB. Adding the 200 MB of malloc memory gives 1.7 GB, and miscellaneous overhead likely accounts for the rest.
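The arithmetic in the paragraph above can be written out as a short sketch. The function names are hypothetical, the 1 MiB figure is the default stack reservation quoted above, and "200 MB" is taken to mean 200,000,000 bytes.

```c
/* Hypothetical helper: number of loop iterations (one thread and one malloc
 * each) completed before `total_malloced` bytes had been handed out. */
unsigned long threads_created(unsigned long long total_malloced,
                              unsigned long long bytes_per_malloc)
{
    return (unsigned long)(total_malloced / bytes_per_malloc);
}

/* Hypothetical helper: address space reserved for thread stacks, in bytes. */
unsigned long long stack_reservation(unsigned long threads,
                                     unsigned long long reserve_per_thread)
{
    return (unsigned long long)threads * reserve_per_thread;
}
```

With 200,000,000 bytes and 131125-byte allocations, threads_created gives 1525, and stack_reservation(1525, 1048576) gives 1,599,078,400 bytes, roughly 1.5 GiB of a 2 GiB address space.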
So, why does Task Manager not show this? Because the full 1 MiB of thread stack space is not actually allocated (also called committed), rather it is reserved. See VirtualAlloc and the MEM_RESERVE flag. The address space has been reserved for expansion up to 1 MiB, but initially only 64 KiB are allocated, and Task Manager only counts the latter. But reserved memory will not be unilaterally repurposed by malloc until the reservation is lifted, so once it runs out of available address space, it has to return NULL.
What tool can show this? I don't know of anything off the shelf (even Process Explorer does not seem to show a count of reserved memory). What I have done in the past is write my own little routine that uses VirtualQuery to enumerate the entire address space, including reserved ranges. I recommend you do the same; it's not much code to write, and very handy when coding for 32-bit Windows because the 2 GiB address space gets cramped very easily (DLLs are an obvious reason, but the default malloc also will leave unexpected reservations behind in response to certain allocation patterns even if you free everything).
In any case, if you want to create thousands of threads in a 32-bit Windows process, be sure to pass a non-zero value as the dwStackSize parameter to CreateThread, and also pass STACK_SIZE_PARAM_IS_A_RESERVATION as dwCreationFlags. The minimum is 64 KiB, which will be plenty if you avoid recursive algorithms in the threads.
Addendum: In a comment, @iinspectable cautions against using thousands of threads, citing Raymond Chen's 2005 blog post Does Windows have a limit of 2000 threads per process?. I agree that doing so is questionable for a variety of reasons; it is not my intent to endorse the practice, rather I'm just explaining one necessary element.

How to tell a C++ program to get the system memory? [duplicate]

I want to size my buffers according to the memory available, so that when processing drives memory usage up, it still stays within the available memory. Is there a way to get the available memory? (I don't know whether virtual or physical memory status makes any difference.) The method has to be platform-independent, as it is going to be used on Windows, OS X, Linux and AIX. (If possible, I would also like to reserve some of the available memory for my application, so that it doesn't change during execution.)
Edit: I did it with configurable memory allocation.
I understand this is not a good idea, as most operating systems manage memory for us, but my application was an ETL framework (intended to be used on servers, but also used on the desktop as a plugin for Adobe InDesign). I was running into issues because, instead of using swap, Windows would return bad_alloc and other applications would start to fail. As I was taught to avoid crashes, I was just trying to degrade gracefully.

On UNIX-like operating systems, there is sysconf.
#include <unistd.h>
unsigned long long getTotalSystemMemory()
{
long pages = sysconf(_SC_PHYS_PAGES);
long page_size = sysconf(_SC_PAGE_SIZE);
return (unsigned long long)pages * (unsigned long long)page_size; /* widen first to avoid overflow where long is 32 bits */
}
On Windows, there is GlobalMemoryStatusEx:
#include <windows.h>
unsigned long long getTotalSystemMemory()
{
MEMORYSTATUSEX status;
status.dwLength = sizeof(status);
GlobalMemoryStatusEx(&status);
return status.ullTotalPhys;
}
So just do some fancy #ifdefs and you'll be good to go.
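A minimal sketch of what those #ifdefs might look like, combining the two functions above into one. The name get_total_system_memory and the choice of returning 0 on failure are assumptions made for this sketch. _SC_PHYS_PAGES is not part of POSIX itself, so its availability on every UNIX-like target (AIX in particular) is also an assumption.

```c
#ifdef _WIN32
#include <windows.h>
#else
#include <unistd.h>
#endif

/* Returns total physical memory in bytes, or 0 if it cannot be determined. */
unsigned long long get_total_system_memory(void)
{
#ifdef _WIN32
    MEMORYSTATUSEX status;
    status.dwLength = sizeof(status);
    if (!GlobalMemoryStatusEx(&status))
        return 0;
    return status.ullTotalPhys;
#else
    long pages = sysconf(_SC_PHYS_PAGES);
    long page_size = sysconf(_SC_PAGE_SIZE);
    if (pages <= 0 || page_size <= 0)
        return 0;
    /* Widen before multiplying so the product cannot overflow a 32-bit long. */
    return (unsigned long long)pages * (unsigned long long)page_size;
#endif
}
```

Only the branch for the current platform is compiled, so each platform's header is included only where it exists.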

There are reasons to want to do this in HPC for scientific software (not game, web, business or embedded software). Scientific software routinely goes through terabytes of data to get through one computation (or run), and runs for hours or weeks; not all of that data can be stored in memory (and if one day you tell me a terabyte is standard for any PC or tablet or phone, it will be the case that scientific software is expected to handle petabytes or more). The amount of memory can also dictate the kind of method/algorithm that makes sense. The user does not always want to decide the memory and method; he/she has other things to worry about. So the programmer should have a good idea of what is available (4 GB or 8 GB or 64 GB or thereabouts these days) to decide whether a method will automatically work or a more laborious method is to be chosen. Disk is used, but memory is preferable. And users of such software are not encouraged to do too many things on their computer while running it; in fact, they often use dedicated machines/servers.

There is no platform independent way to do this, different operating systems use different memory management strategies.
These other stack overflow questions will help:
How to get memory usage at run time in c++?
C/C++ memory usage API in Linux/Windows
You should watch out though: it is notoriously difficult to get a "real" value for available memory in Linux. What the operating system displays as used by a process is no guarantee of what is actually allocated for the process.
This is a common issue when developing embedded Linux systems such as routers, where you want to buffer as much as the hardware allows. Here is a link to an example showing how to get this information on Linux (in C):
http://www.unix.com/programming/25035-determining-free-available-memory-mv-linux.html

Having read through these answers I'm astonished that so many take the stance that OP's computer memory belongs to others. It's his computer and his memory to do with as he sees fit, even if it breaks other systems laying claim to it. It's an interesting question. On a more primitive system I had memavail() which would tell me this. Why shouldn't the OP take as much memory as he wants without upsetting other systems?
Here's a solution that allocates less than half the memory available, just to be kind. Output was:
Required FFFFFFFF
Required 7FFFFFFF
Required 3FFFFFFF
Memory size allocated = 1FFFFFFF
#include <stdio.h>
#include <stdlib.h>
#define MINREQ 0xFFF // arbitrary minimum
int main(void)
{
unsigned int required = (unsigned int)-1; // adapt to native uint
char *mem = NULL;
while (mem == NULL) {
printf ("Required %X\n", required);
mem = malloc (required);
if ((required >>= 1) < MINREQ) {
if (mem) free (mem);
printf ("Cannot allocate enough memory\n");
return (1);
}
}
free (mem);
mem = malloc (required);
if (mem == NULL) {
printf ("Cannot allocate enough memory\n");
return (1);
}
printf ("Memory size allocated = %X\n", required);
free (mem);
return 0;
}

Mac OS X example using sysctl (man 3 sysctl):
#include <stdio.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/sysctl.h>
int main(void)
{
int mib[2] = { CTL_HW, HW_MEMSIZE };
u_int namelen = sizeof(mib) / sizeof(mib[0]);
uint64_t size;
size_t len = sizeof(size);
if (sysctl(mib, namelen, &size, &len, NULL, 0) < 0)
{
perror("sysctl");
}
else
{
printf("HW.HW_MEMSIZE = %llu bytes\n", size);
}
return 0;
}
(may also work on other BSD-like operating systems ?)

The code below gives the free and available memory in bytes. It works on FreeBSD, but you should be able to use the same or similar sysctl tunables on your platform to do the same thing (Linux and OS X have sysctl at least).
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/sysctl.h>
#include <sys/vmmeter.h>
int main(){
int rc;
u_int page_size;
struct vmtotal vmt;
size_t vmt_size, uint_size;
vmt_size = sizeof(vmt);
uint_size = sizeof(page_size);
rc = sysctlbyname("vm.vmtotal", &vmt, &vmt_size, NULL, 0);
if (rc < 0){
perror("sysctlbyname");
return 1;
}
rc = sysctlbyname("vm.stats.vm.v_page_size", &page_size, &uint_size, NULL, 0);
if (rc < 0){
perror("sysctlbyname");
return 1;
}
printf("Free memory : %llu\n", (unsigned long long)vmt.t_free * page_size);
printf("Available memory : %llu\n", (unsigned long long)vmt.t_avm * page_size);
return 0;
}
Below is the output of the program, compared with the vmstat(8) output on my system.
~/code/memstats % cc memstats.c
~/code/memstats % ./a.out
Free memory : 5481914368
Available memory : 8473378816
~/code/memstats % vmstat
procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr ad0 ad1 in sy cs us sy id
0 0 0 8093M 5228M 287 0 1 0 304 133 0 0 112 9597 1652 2 1 97

Linux currently free memory: sysconf(_SC_AVPHYS_PAGES) and get_avphys_pages()
The total RAM was covered at https://stackoverflow.com/a/2513561/895245 with sysconf(_SC_PHYS_PAGES);.
Both sysconf(_SC_AVPHYS_PAGES) and get_avphys_pages() are glibc extensions to POSIX that give instead the total currently available RAM pages.
You then just have to multiply them by sysconf(_SC_PAGE_SIZE) to obtain the current free RAM.
Minimal runnable example at: C - Check available free RAM?
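As a minimal sketch of the multiplication described above (the function name get_free_ram and the return value of 0 on failure are assumptions made here, not part of glibc):

```c
#include <unistd.h>

/* Currently free physical RAM in bytes, or 0 on failure.
 * _SC_AVPHYS_PAGES is a glibc extension, not POSIX. */
unsigned long long get_free_ram(void)
{
    long pages = sysconf(_SC_AVPHYS_PAGES);
    long page_size = sysconf(_SC_PAGE_SIZE);
    if (pages < 0 || page_size <= 0)
        return 0;
    /* Widen before multiplying so the product cannot overflow a 32-bit long. */
    return (unsigned long long)pages * (unsigned long long)page_size;
}
```

The value is only a snapshot: it can change between the call and any allocation that is sized from it.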

The "official" function for this was std::get_temporary_buffer(). However, you might want to test whether your platform has a decent implementation. I understand that not all platforms behave as desired.

Instead of trying to guess, have you considered letting the user configure how much memory to use for buffers, as well as assuming somewhat conservative defaults? This way you can still run (possibly slightly slower) with no override, but if the user knows there is X memory available for the app, they can improve performance by configuring that amount.
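One way such a setting could be read is sketched below. The helper name buffer_budget, the choice of MiB as the unit, and the idea of feeding it an environment variable are all assumptions for illustration; a configuration file or command-line flag would work the same way.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a user-supplied size in MiB (for example the
 * value of an environment variable) and return it in bytes. Falls back to
 * `default_bytes` when the text is missing, malformed, zero, or too large. */
unsigned long long buffer_budget(const char *text, unsigned long long default_bytes)
{
    if (text == NULL || *text == '\0')
        return default_bytes;

    errno = 0;
    char *end = NULL;
    unsigned long long mib = strtoull(text, &end, 10);
    /* Reject trailing junk, zero, and values whose byte count would overflow
     * (this also catches negative input, which strtoull wraps around). */
    if (errno != 0 || *end != '\0' || mib == 0 || mib > (~0ULL >> 20))
        return default_bytes;
    return mib << 20;
}
```

A call such as buffer_budget(getenv("APP_BUFFER_MIB"), 64ULL << 20) would then use 64 MiB unless the user overrides it; APP_BUFFER_MIB is a made-up variable name.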

Here is a proposal to get available memory on Linux platform:
#include <fstream>
#include <string>
/// Provides the available RAM memory in kibibytes (1 KiB = 1024 B) on Linux platform (Available memory in /proc/meminfo)
/// For more info about /proc/meminfo : https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/6/html/deployment_guide/s2-proc-meminfo
long long getAvailableMemory()
{
long long memAvailable = -1;
std::ifstream meminfo("/proc/meminfo");
std::string line;
while (std::getline(meminfo, line))
{
if (line.find("MemAvailable:") != std::string::npos)
{
const std::size_t firstWhiteSpacePos = line.find_first_of(' ');
const std::size_t firstNonWhiteSpaceChar = line.find_first_not_of(' ', firstWhiteSpacePos);
const std::size_t nextWhiteSpace = line.find_first_of(' ', firstNonWhiteSpaceChar);
const std::size_t numChars = nextWhiteSpace - firstNonWhiteSpaceChar;
const std::string memAvailableStr = line.substr(firstNonWhiteSpaceChar, numChars);
memAvailable = std::stoll(memAvailableStr);
break;
}
}
return memAvailable;
}

What could cause a mutex to misbehave?

I've been busy the last couple of months debugging a rare crash caused somewhere within a very large proprietary C++ image processing library, compiled with GCC 4.7.2 for an ARM Cortex-A9 Linux target. Since a common symptom was glibc complaining about heap corruption, the first step was to employ a heap corruption checker to catch oob memory writes. I used the technique described in https://stackoverflow.com/a/17850402/3779334 to divert all calls to free/malloc to my own function, padding every allocated chunk of memory with some amount of known data to catch out-of-bounds writes - but found nothing, even when padding with as much as 1 KB before and after every single allocated block (there are hundreds of thousands of allocated blocks due to intensive use of STL containers, so I can't enlarge the padding further, plus I assume any write more than 1KB out of bounds would eventually trigger a segfault anyway). This bounds checker has found other problems in the past so I don't doubt its functionality.
(Before anyone says 'Valgrind', yes, I have tried that too with no results either.)
Now, my memory bounds checker also has a feature where it prepends every allocated block with a data struct. These structs are all linked in one long linked list, to allow me to occasionally go over all allocations and test memory integrity. For some reason, even though all manipulations of this list are mutex protected, the list was getting corrupted. When investigating the issue, it began to seem like the mutex itself was occasionally failing to do its job. Here is the pseudocode:
pthread_mutex_t alloc_mutex;
static bool boolmutex; // set to false during init. volatile has no effect.
void malloc_wrapper() {
// ...
pthread_mutex_lock(&alloc_mutex);
if (boolmutex) {
printf("mutex misbehaving\n");
__THROW_ERROR__; // this happens!
}
boolmutex = true;
// manipulate linked list here
boolmutex = false;
pthread_mutex_unlock(&alloc_mutex);
// ...
}
The code commented with "this happens!" is occasionally reached, even though this seems impossible. My first theory was that the mutex data structure was being overwritten. I placed the mutex within a struct, with large arrays before and after it, but when this problem occurred the arrays were untouched so nothing seems to be overwritten.
So.. What kind of corruption could possibly cause this to happen, and how would I find and fix the cause?
A few more notes. The test program uses 3-4 threads for processing. Running with fewer threads seems to make the corruptions less common, but not disappear. The test runs for about 20 seconds each time and completes successfully in the vast majority of cases (I can have 10 units repeating the test, with the first failure occurring after 5 minutes to several hours). When the problem occurs it is quite late in the test (say, 15 seconds in), so this isn't a bad initialization issue. The memory bounds checker never catches actual out of bounds writes but glibc still occasionally fails with a corrupted heap error (Can such an error be caused by something other than an oob write?). Each failure generates a core dump with plenty of trace information; there is no pattern I can see in these dumps, no particular section of code that shows up more than others. This problem seems very specific to a particular family of algorithms and does not happen in other algorithms, so I'm quite certain this isn't a sporadic hardware or memory error. I have done many more tests to check for oob heap accesses which I don't want to list to keep this post from getting any longer.
Thanks in advance for any help!

Thanks to all commenters. I've tried nearly all suggestions with no results, when I finally decided to write a simple memory allocation stress test - one that would run a thread on each of the CPU cores (my unit is a Freescale i.MX6 quad core SoC), each allocating and freeing memory in random order at high speed. The test crashed with a glibc memory corruption error within minutes or a few hours at most.
Updating the kernel from 3.0.35 to 3.0.101 solved the problem; both the stress test and the image processing algorithm now run overnight without failing. The problem does not reproduce on Intel machines with the same kernel version, so the problem is specific either to ARM in general or perhaps to some patch Freescale included with the specific BSP version that included kernel 3.0.35.
For those curious, attached is the stress test source code. Set NUM_THREADS to the number of CPU cores and build with:
<cross-compiler-prefix>g++ -O3 test_heap.cpp -lpthread -o test_heap
I hope this information helps someone. Cheers :)
// Multithreaded heap stress test. By Itay Chamiel 20151012.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
#include <pthread.h>
#include <sys/time.h>
#define NUM_THREADS 4 // set to number of CPU cores
#define ALIVE_INDICATOR NUM_THREADS
// Each thread constantly allocates and frees memory. In each iteration of the infinite loop, decide at random whether to
// allocate or free a block of memory. A list of 500-1000 allocated blocks is maintained by each thread. When memory is allocated
// it is added to this list; when freeing, a random block is selected from this list, freed and removed from the list.
void* thr(void* arg) {
int* alive_flag = (int*)arg;
int thread_id = *alive_flag; // this is a number between 0 and (NUM_THREADS-1) given by main()
int cnt = 0;
timeval t_pre, t_post;
gettimeofday(&t_pre, NULL);
const int ALLOCATE=1, FREE=0;
const unsigned int MINSIZE=500, MAXSIZE=1000;
const int MAX_ALLOC=10000;
char* membufs[MAXSIZE];
unsigned int membufs_size = 0;
int num_allocs = 0, num_frees = 0;
while(1)
{
int action;
// Decide whether to allocate or free a memory block.
// if we have less than MINSIZE buffers, allocate.
if (membufs_size < MINSIZE) action = ALLOCATE;
// if we have MAXSIZE, free.
else if (membufs_size >= MAXSIZE) action = FREE;
// else, decide randomly.
else {
action = ((rand() & 0x1)? ALLOCATE : FREE);
}
if (action == ALLOCATE) {
// choose size to allocate, from 1 to MAX_ALLOC bytes
size_t size = (rand() % MAX_ALLOC) + 1;
// allocate and fill memory
char* buf = (char*)malloc(size);
memset(buf, 0x77, size);
// add buffer to list
membufs[membufs_size] = buf;
membufs_size++;
assert(membufs_size <= MAXSIZE);
num_allocs++;
}
else { // action == FREE
// choose a random buffer to free
size_t pos = rand() % membufs_size;
assert (pos < membufs_size);
// free and remove from list by replacing entry with last member
free(membufs[pos]);
membufs[pos] = membufs[membufs_size-1];
membufs_size--;
assert(membufs_size >= 0);
num_frees++;
}
// once in 10 seconds print a status update
gettimeofday(&t_post, NULL);
if (t_post.tv_sec - t_pre.tv_sec >= 10) {
printf("Thread %d [%d] - %d allocs %d frees. Alloced blocks %u.\n", thread_id, cnt++, num_allocs, num_frees, membufs_size);
gettimeofday(&t_pre, NULL);
}
// indicate alive to main thread
*alive_flag = ALIVE_INDICATOR;
}
return NULL;
}
int main()
{
int alive_flag[NUM_THREADS];
printf("Memory allocation stress test running on %d threads.\n", NUM_THREADS);
// start a thread for each core
for (int i=0; i<NUM_THREADS; i++) {
alive_flag[i] = i; // tell each thread its ID.
pthread_t th;
int ret = pthread_create(&th, NULL, thr, &alive_flag[i]);
assert(ret == 0);
}
while(1) {
sleep(10);
// check that all threads are alive
bool ok = true;
for (int i=0; i<NUM_THREADS; i++) {
if (alive_flag[i] != ALIVE_INDICATOR)
{
printf("Thread %d is not responding\n", i);
ok = false;
}
}
assert(ok);
for (int i=0; i<NUM_THREADS; i++)
alive_flag[i] = 0;
}
return 0;
}

Why this app doesn't consume as much memory as expected

I wrote a simple application to test memory consumption. In this test application, I created four processes to continually consume memory, those processes won't release the memory unless the process exits.
I expected this test application to consume the most memory of RAM and cause the other application to slow down or crash. But the result is not the same as expected. Below is the code:
#include <stdio.h>
#include <unistd.h>
#include <list>
#include <vector>
using namespace std;
unsigned short calcrc(unsigned char *ptr, int count)
{
unsigned short crc;
unsigned char i;
//high cpu-consumption code
//implements the CRC algorithm
//CRC is Cyclic Redundancy Code
}
void* ForkChild(void* param){
vector<unsigned char*> MemoryVector;
pid_t PID = fork();
if (PID > 0){
const int TEN_MEGA = 10 * 10 * 1024 * 1024;
unsigned char* buffer = NULL;
while(1){
buffer = NULL;
buffer = new unsigned char [TEN_MEGA];
if (buffer){
try{
calcrc(buffer, TEN_MEGA);
MemoryVector.push_back(buffer);
} catch(...){
printf("An error was throwed, but caught by our app!\n");
delete [] buffer;
buffer = NULL;
}
}
else{
printf("no memory to allocate!\n");
try{
if (MemoryVector.size()){
buffer = MemoryVector[0];
calcrc(buffer, TEN_MEGA);
buffer = NULL;
} else {
printf("no memory ever allocated for this Process!\n");
continue;
}
} catch(...){
printf("An error was throwed -- branch 2,"
"but caught by our app!\n");
buffer = NULL;
}
}
} //while(1)
} else if (PID == 0){
} else {
perror("fork error");
}
return NULL;
}
int main(){
int children = 4;
while(--children >= 0){
ForkChild(NULL);
};
while(1) sleep(1);
printf("exiting main process\n");
return 0;
}
TOP command
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2775 steve 20 0 1503m 508 312 R 99.5 0.0 1:00.46 test
2777 steve 20 0 1503m 508 312 R 96.9 0.0 1:00.54 test
2774 steve 20 0 1503m 904 708 R 96.6 0.0 0:59.92 test
2776 steve 20 0 1503m 508 312 R 96.2 0.0 1:00.57 test
Although CPU usage is high, the memory percentage remains 0.0. How is that possible?
Free command
free shared buffers cached
Mem: 3083796 0 55996 428296
Free memory is more than 3G out of 4G RAM.
Does anybody know why this test app doesn't work as expected?

Linux uses optimistic memory allocation: it will not physically allocate a page of memory until that page is actually written to. For that reason, you can allocate much more memory than what is available, without increasing memory consumption by the system.
If you want to force the system to allocate (commit) a physical page, then you have to write to it.
The following line does not issue any write, as it is default-initialization of unsigned char, which is a no-op:
buffer = new unsigned char [TEN_MEGA];
If you want to force a commit, use zero-initialization:
buffer = new unsigned char [TEN_MEGA]();
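The effect described in this answer can be observed directly. The following is a minimal sketch in C rather than C++, using the Linux mincore call to count how many pages of a page-aligned region are resident; resident_pages is a hypothetical helper name. On a typical Linux system one would expect a fresh anonymous mapping to report few or no resident pages, and every page to be reported resident once it has been written, though the "before" figure is not guaranteed in every environment.

```c
#define _DEFAULT_SOURCE   /* for mincore and MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: count how many pages of [addr, addr + len) are
 * resident in RAM. `addr` must be page-aligned. Returns (size_t)-1 on error. */
size_t resident_pages(void *addr, size_t len)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;
    unsigned char *vec = malloc(npages);
    if (vec == NULL)
        return (size_t)-1;

    size_t count = (size_t)-1;
    if (mincore(addr, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;   /* bit 0 set means the page is resident */
    }
    free(vec);
    return count;
}
```

To try it, map a region with mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0), call resident_pages on it, write one byte to each page, and call it again.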

To make the comments into an answer:
Linux will not allocate memory pages for a process until it writes to them (copy-on-write).
Additionally, you are not writing to your buffer anywhere, as the default constructor for unsigned char does not perform any initializations, and new[] default-initializes all items.

fork() returns the PID in the parent, and 0 in the child. Your ForkChild as written will execute all the work in the parent, not the child.
And the standard new operator will never return null; it will throw if it fails to allocate memory (but due to overcommit it won't actually do that either in Linux). This means your test of buffer after the allocation is meaningless: it will always either take the first branch or never reach the test. If you want a null return, you need to write new (std::nothrow) .... Include <new> for that to work.
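The fork() return-value convention mentioned above can be sketched as follows, in C. The names run_in_child and example_work are hypothetical; example_work merely stands in for the allocation loop from the question.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run `work` in a forked child and return the child's
 * exit status, or -1 if fork/waitpid fails or the child dies abnormally. */
int run_in_child(int (*work)(void))
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;              /* fork failed; no child exists */
    if (pid == 0)
        _exit(work());          /* child: fork() returned 0 */

    /* parent: fork() returned the child's PID */
    int status;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}

/* Example workload: stands in for the allocation loop of the question. */
int example_work(void)
{
    return 7;
}
```

Here the work runs in the branch where fork() returned 0, which is the opposite of the PID > 0 test in the question's ForkChild.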

But your program is in fact doing what you expected it to do. As @Michael Foukarakis's answer pointed out, memory that is not used is not allocated. In your output of the top program, I noticed that the column VIRT had a large amount of memory on it for each process running your program. A little googling later, I saw what this was:
VIRT -- Virtual Memory Size (KiB). The total amount of virtual memory used by the task. It includes all code, data and shared libraries plus pages that have been swapped out and pages that have been mapped but not used.
So as you can see, your program does in fact generate memory for itself, but in the form of pages and stored as virtual memory. And I think that is a smart thing to do.
A snippet from this wiki page
A page, memory page, or virtual page -- a fixed-length contiguous block of virtual memory, and it is the smallest unit of data for the following:
memory allocation performed by the operating system for a program; and
transfer between main memory and any other auxiliary store, such as a hard disk drive.
...Thus a program can address more (virtual) RAM than physically exists in the computer. Virtual memory is a scheme that gives users the illusion of working with a large block of contiguous memory space (perhaps even larger than real memory), when in actuality most of their work is on auxiliary storage (disk). Fixed-size blocks (pages) or variable-size blocks of the job are read into main memory as needed.
Sources:
http://www.computerhope.com/unix/top.htm
https://stackoverflow.com/a/18917909/2089675
http://en.wikipedia.org/wiki/Page_(computer_memory)

If you want to gobble up a lot of memory:
int mb = 0;
char* buffer;
while (1) {
buffer = malloc(1024*1024);
memset(buffer, 0, 1024*1024);
mb++;
}
I used something like this to make sure the file buffer cache was empty when taking some file I/O timing measurements.
As other answers have already mentioned, your code doesn't ever write to the buffer after allocating it. Here memset is used to write to the buffer.

How to get a "bus error"?

I am trying very hard to get a bus error.
One way is misaligned access and I have tried the examples given here and here, but no error for me - the programs execute just fine.
Is there some situation which is sure to produce a bus error?

This should reliably result in a SIGBUS on a POSIX-compliant system.
#include <unistd.h>
#include <stdio.h>
#include <sys/mman.h>
int main() {
FILE *f = tmpfile();
int *m = mmap(0, 4, PROT_WRITE, MAP_PRIVATE, fileno(f), 0);
*m = 0;
return 0;
}
From the Single Unix Specification, mmap:
References within the address range starting at pa and continuing for len bytes to whole pages following the end of an object shall result in delivery of a SIGBUS signal.
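To observe the signal without crashing the calling program, the same idea can be wrapped in a child process. This is only a sketch: the function name is hypothetical, and the expectation that the child is killed by SIGBUS rests on the specification text quoted above.

```c
#include <signal.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: in a child process, map the first bytes of an empty
 * temporary file and write to them. Returns the number of the signal that
 * killed the child, 0 if it was not killed by a signal, -1 on failure. */
int signal_from_write_past_eof(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        FILE *f = tmpfile();
        if (f == NULL)
            _exit(2);
        volatile int *m = mmap(NULL, sizeof *m, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE, fileno(f), 0);
        if (m == MAP_FAILED)
            _exit(3);
        *m = 0;     /* the file is empty, so this page lies beyond its end */
        _exit(0);
    }
    int status;
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

Comparing the return value with SIGBUS tells the parent whether the bus error occurred.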

Bus errors can only be invoked on hardware platforms that:
Require aligned access, and
Don't compensate for an unaligned access by performing two aligned accesses and combining the results.
You probably do not have access to such a system.

Try something along the lines of:
#include <signal.h>
int main(void)
{
raise(SIGBUS);
return 0;
}
(I know, probably not the answer you want, but it's almost sure to get you a "bus error"!)

As others have mentioned this is very platform specific. On the ARM system I'm working with (which doesn't have virtual memory) there are large portions of the address space which have no memory or peripheral assigned. If I read or write one of those addresses, I get a bus error.
You can also get a bus error if there's actually a hardware problem on the bus.
If you're running on a platform with virtual memory, you might not be able to intentionally generate a bus error with your program unless it's a device driver or other kernel mode software. An invalid memory access would likely be trapped as an access violation or similar by the memory manager (and it never even has a chance to hit the bus).

On Linux with an Intel CPU, try this:
int main(int argc, char **argv)
{
# if defined i386
/* enable alignment check (AC) */
asm("pushf; "
"orl $(1<<18), (%esp); "
"popf;");
# endif
char d[] = "12345678"; /* yep! - causes SIGBUS even on Linux-i386 */
return 0;
}
The trick here is to set the "alignment check" bit in one of the CPU's "special" registers.
see also: here

I am sure you must be using an x86 machine.
An x86 CPU does not generate a bus error on unaligned access unless the AC flag in its EFLAGS register is set.
Try this code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char *p;
__asm__("pushf\n"
"orl $0x40000, (%rsp)\n"
"popf");
/*
* malloc() always provides aligned memory.
* Do not use stack variable like a[9], depending on the compiler you use,
* a may not be aligned properly.
*/
p = malloc(sizeof(int) + 1);
memset(p, 0, sizeof(int) + 1);
/* making p unaligned */
p++;
printf("%d\n", *(int *)p);
return 0;
}
More about this can be found at http://orchistro.tistory.com/206

Also keep in mind that some operating systems report "bus error" for errors other than misaligned access. You didn't mention in your question what you were actually trying to achieve. Maybe try this:
int *x = 0;
*x=1;
The Wikipedia page you linked to mentions that access to non-existent memory can also result in a bus error. You might have better luck loading a known-invalid address into a pointer and dereferencing that.

How about this? untested.
#include<stdio.h>
typedef struct
{
int a;
int b;
} busErr;
int main()
{
busErr err;
char * cPtr;
int *iPtr;
cPtr = (char *)&err;
cPtr++;
iPtr = (int *)cPtr;
*iPtr = 10;
}

int main(int argc, char **argv)
{
char *bus_error = new char[1];
for (int i=0; i<1000000000;i++) {
bus_error += 0xFFFFFFFFFFFFFFF;
*(bus_error + 0xFFFFFFFFFFFFFF) = 'X';
}
}
Bus error: 10 (core dumped)

Simple, write to memory that isn't yours:
int main()
{
char *bus_error = 0;
*bus_error = 'X';
}
Instant bus error on my PowerPC Mac [OS X 10.4, dual 1ghz PPC7455's], not necessarily on your hardware and/or operating system.
There's even a wikipedia article about bus errors, including a program to make one.

For the x86 architecture:
#include <stdio.h>
int main()
{
#if defined(__GNUC__)
# if defined(__i386__)
/* Enable Alignment Checking on x86 */
__asm__("pushf\norl $0x40000,(%esp)\npopf");
# elif defined(__x86_64__)
/* Enable Alignment Checking on x86_64 */
__asm__("pushf\norl $0x40000,(%rsp)\npopf");
# endif
#endif
int b = 0;
int a = 0xffffff;
char *c = (char*)&a;
c++;
int *p = (int*)c;
*p = 10; //Bus error as memory accessed by p is not 4 or 8 byte aligned
printf ("%zu\n", sizeof(a));
printf ("%x\n", *p);
printf ("%p\n", (void *)p);
printf ("%p\n", (void *)&a);
}
Note: if the asm instructions are removed, the code won't generate the SIGBUS error, as suggested by others.
SIGBUS can occur for other reasons too.

Bus errors occur if you try to access memory that is not addressable by your computer. For example, your computer's memory has an address range 0x00 to 0xFF but you try to access a memory element at 0x0100 or greater.
In reality, your computer will have a much greater range than 0x00 to 0xFF.
To answer your original post:
Tell me some situation which is sure to produce a bus error.
In your code, index into memory way outside the scope of the max memory limit. I dunno ... use some kind of giant hex value 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF indexed into a char* ...
