I see an abnormal memory usage pattern while running my application program on AIX...
I have created a simple program that mallocs and frees memory to replicate the problem.
#include <stdio.h>
#include <stdlib.h>

int main()
{
    int *ptr_one;
    // Enter the value 0 to start.
    // The scanf gives me a few seconds to fetch the PID of this standalone
    // process and run 'ps -p <PID> -o "vsz rssize"'.
    long a;
    scanf("%ld", &a);
    for (;;)
    {
        if (a < 10000000) a = a + 100;
        ptr_one = (int *)malloc(sizeof(int) * a);
        if (ptr_one == 0) {
            printf("ERROR: Out of memory\n");
            return 1;
        }
        *ptr_one = 25;
        printf("%d\n", *ptr_one);
        free(ptr_one);
    }
    return 0;
}
I have captured the memory usage of this program using the command below:
ps -p $1 -o "vsz rssize" | tail -1 >> out.txt
The graph shows that the memory keeps growing and is never released.
Is this a sign of a leak, or is this normal memory behavior on AIX?
It is entirely normal that the process's memory usage does not decrease: while malloc can request additional memory for the process, free typically does not return it to the operating system. Instead, freed memory is reused by future malloc calls.
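A minimal sketch to watch this happen, using sbrk(0) to read the current program break (this assumes the allocator grows the heap via sbrk, which holds for small blocks on AIX and glibc; glibc may still trim the top of the heap in some cases, but the break after the frees is typically unchanged):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    enum { N = 10000 };
    static char *blocks[N];
    printf("break at start:      %p\n", sbrk(0));
    for (int i = 0; i < N; i++)
        blocks[i] = malloc(1024);          /* small blocks: heap grows via sbrk */
    printf("break after mallocs: %p\n", sbrk(0));
    for (int i = 0; i < N; i++)
        free(blocks[i]);                   /* returned to the allocator, not the OS */
    printf("break after frees:   %p\n", sbrk(0));
    return 0;
}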
I have a sample C++ program as below:
#include <windows.h>
#include <stdio.h>
int main(int argc, char* argv[])
{
void * pointerArr[20000];
int i = 0, j;
for (i = 0; i < 20000; i++) {
void * pointer = malloc(131125);
if (pointer == NULL) {
printf("i = %d, out of memory!\n", i);
getchar();
break;
}
pointerArr[i] = pointer;
}
for (j = 0; j < i; j++) {
free(pointerArr[j]);
}
getchar();
return 0;
}
When I run it as a 32-bit Debug build in Visual Studio, I get the following result:
The program can use nearly 2 GB of memory before running out of memory.
This is normal behavior.
However, when I add code to start a thread inside the for loop, as below:
#include <windows.h>
#include <stdio.h>
DWORD WINAPI thread_func(VOID* pInArgs)
{
Sleep(100000);
return 0;
}
int main(int argc, char* argv[])
{
void * pointerArr[20000];
int i = 0, j;
for (i = 0; i < 20000; i++) {
CreateThread(NULL, 0, thread_func, NULL, 0, NULL);
void * pointer = malloc(131125);
if (pointer == NULL) {
printf("i = %d, out of memory!\n", i);
getchar();
break;
}
pointerArr[i] = pointer;
}
for (j = 0; j < i; j++) {
free(pointerArr[j]);
}
getchar();
return 0;
}
The result is as below:
The memory usage is still only around 200 MB, but malloc returns NULL.
Could anyone explain why the program cannot use memory up to 2 GB before running out?
Does this mean that creating many threads like this causes a memory leak?
In my real application, this error occurs when I create about 800 threads; the RAM usage at the moment of "out of memory" is around 300 MB.
As noted in a comment by #macroland, the main thing happening here is that each thread is consuming 1 MiB for its stack (see MSDN CreateThread and Thread Stack Size). You say malloc returns NULL once the total you have directly allocated reaches 200 MB. Since you are allocating 131125 bytes at a time, that is 200 MB / 131125 B ≈ 1525 allocations, and you create one thread per allocation. The cumulative stack space of those 1525 threads is around 1.5 GiB. Adding the 200 MB of malloc memory gives roughly 1.7 GiB, and miscellaneous overhead likely accounts for the rest of the 2 GiB address space.
So, why does Task Manager not show this? Because the full 1 MiB of thread stack space is not actually allocated (also called committed), rather it is reserved. See VirtualAlloc and the MEM_RESERVE flag. The address space has been reserved for expansion up to 1 MiB, but initially only 64 KiB are allocated, and Task Manager only counts the latter. But reserved memory will not be unilaterally repurposed by malloc until the reservation is lifted, so once it runs out of available address space, it has to return NULL.
What tool can show this? I don't know of anything off the shelf (even Process Explorer does not seem to show a count of reserved memory). What I have done in the past is write my own little routine that uses VirtualQuery to enumerate the entire address space, including reserved ranges. I recommend you do the same; it's not much code to write, and very handy when coding for 32-bit Windows because the 2 GiB address space gets cramped very easily (DLLs are an obvious reason, but the default malloc also will leave unexpected reservations behind in response to certain allocation patterns even if you free everything).
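A minimal sketch of such a routine (my reconstruction, not production code): it walks the whole address space with VirtualQuery and tallies committed, reserved, and free bytes.

#include <windows.h>
#include <stdio.h>

int main(void)
{
    SIZE_T committed = 0, reserved = 0, free_bytes = 0;
    MEMORY_BASIC_INFORMATION mbi;
    unsigned char* addr = 0;
    /* VirtualQuery fails once addr passes the end of user address space. */
    while (VirtualQuery(addr, &mbi, sizeof(mbi)) == sizeof(mbi)) {
        if (mbi.State == MEM_COMMIT)       committed  += mbi.RegionSize;
        else if (mbi.State == MEM_RESERVE) reserved   += mbi.RegionSize;
        else                               free_bytes += mbi.RegionSize; /* MEM_FREE */
        addr += mbi.RegionSize;
    }
    printf("committed: %lu KiB, reserved: %lu KiB, free: %lu KiB\n",
           (unsigned long)(committed / 1024),
           (unsigned long)(reserved / 1024),
           (unsigned long)(free_bytes / 1024));
    return 0;
}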
In any case, if you want to create thousands of threads in a 32-bit Windows process, be sure to pass a non-zero value as the dwStackSize parameter to CreateThread, and also pass STACK_SIZE_PARAM_IS_A_RESERVATION as dwCreationFlags. The minimum is 64 KiB, which will be plenty if you avoid recursive algorithms in the threads.
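For example (a sketch of the call, reusing thread_func from the question):

HANDLE h = CreateThread(
    NULL,
    64 * 1024,                          // dwStackSize: reserve only 64 KiB
    thread_func,
    NULL,
    STACK_SIZE_PARAM_IS_A_RESERVATION,  // treat the size as the reservation
    NULL);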
Addendum: In a comment, #iinspectable cautions against using thousands of threads, citing Raymond Chen's 2005 blog post Does Windows have a limit of 2000 threads per process?. I agree that doing so is questionable for a variety of reasons; it is not my intent to endorse the practice, rather I'm just explaining one necessary element.
I am trying to find a slow memory leak in a large application.
ps shows the VSZ growing slowly until the application crashes after running for 12-18 hours. Unfortunately, Valgrind, leak checkers, etc. have not been useful (Valgrind fails with an illegal instruction).
Alternatively, I've been printing the contents of /proc/<pid>/statm over time, and approximately every 10 s I see the first field of statm (total program size) increase by 20-30 (the statm fields are measured in pages, not bytes).
I've tracked it down to one function, but it doesn't make sense: the offending function just reads a directory and performs a clear() on a std::set. What in the function could increase the memory footprint? And why doesn't the memory usage drop once the directory is closed?
Trace output (for reference, the seven statm fields are, in order and in pages: size, resident, shared, text, lib, data, dt; see proc(5)):
DvCfgProfileList::obtainSystemProfileList() PRE MEMORY USAGE: 27260 11440 7317 15 0 12977 0
DvCfgProfileList::obtainSystemProfileList() MID 1 MEMORY USAGE: 27296 11440 7317 15 0 13013 0
DvCfgProfileList::obtainSystemProfileList() MID 2 MEMORY USAGE: 27296 11443 7317 15 0 13013 0
DvCfgProfileList::obtainSystemProfileList POST MEMORY USAGE: 27288 11443 7317 15 0 13005 0
The Big Question
Can I rely on reading /proc/statm for an immediate reading of process memory? This Unix/Linux Posting says it is "updated on every access".
If true, then why does it indicate that obtainSystemProfileList() is leaking?
EDIT I
I added the link to the Unix/Linux post. So if reads of /proc/<pid>/statm result in a direct and immediate kernel call, is there some time delay in the kernel updating its own internal counters? If there is indeed no memory leak in the code fragment, what else explains the change in the memory values across a few lines of code?
EDIT II
Would calling getrusage() provide a more immediate and accurate view of process memory use, or does it just make the same, potentially delayed, kernel calls as reading /proc/<pid>/statm?
Kernel is 32-bit 3.10.80-1 if that makes any difference...
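For what it's worth, a minimal getrusage() sketch; note that on Linux ru_maxrss reports the peak resident set size in kilobytes, a high-water mark that never decreases, so it answers a different question than statm's instantaneous values:

#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
    struct rusage ru;
    if (getrusage(RUSAGE_SELF, &ru) == 0)
        printf("peak RSS: %ld KiB\n", ru.ru_maxrss); /* high-water mark */
    return 0;
}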
Code Fragment:
bool
DvCfgProfileList::obtainSystemProfileList()
{
TRACE(("DvCfgProfileList::obtainSystemProfileList() PRE "));
DvComUtil::printMemoryUsage();
DIR *pDir = opendir(SYSTEM_PROFILE_DIRECTORY);
if (pDir == 0)
{
mkdir(SYSTEM_PROFILE_DIRECTORY, S_IRWXU | S_IRWXG | S_IRWXO);
pDir = opendir(SYSTEM_PROFILE_DIRECTORY);
if (pDir == 0)
{
TRACE(("%s does not exist or cannot be created\n", SYSTEM_PROFILE_DIRECTORY));
return false;
}
}
TRACE(("DvCfgProfileList::obtainSystemProfileList() MID 1 "));
DvComUtil::printMemoryUsage();
mpProfileList->clearSystemProfileList(); // calls (std::set) mProfileList.clear()
TRACE(("DvCfgProfileList::obtainSystemProfileList() MID 2 "));
DvComUtil::printMemoryUsage();
struct dirent *pEntry;
while ((pEntry = readdir(pDir)) != 0)
{
if (!strcmp(pEntry->d_name, ".") || !strcmp(pEntry->d_name, ".."))
continue;
TRACE(("Profile name = %s\n", pEntry->d_name));
mpProfileList->addSystemProfile(std::string(pEntry->d_name));
}
closedir(pDir);
printf("DvCfgProfileList::obtainSystemProfileList POST ");
DvComUtil::printMemoryUsage();
return true;
}
/* static */ void
DvComUtil::printMemoryUsage()
{
char fname[256], line[256];
sprintf(fname, "/proc/%d/statm", getpid());
FILE *pFile = fopen(fname, "r");
if (!pFile)
return;
fgets(line, 255, pFile);
fclose(pFile);
printf("MEMORY USAGE: %s", line);
}
I am trapped in a weird situation: my C++ code keeps consuming more memory (reaching around 70 GB) until the whole process gets killed.
I am invoking C++ code from Python; the C++ code implements the longest-common-subsequence length algorithm.
The C++ code is shown below:
#define MAX(a,b) (((a)>(b))?(a):(b))
#include <stdio.h>
int LCSLength(long unsigned X[], long unsigned Y[], int m, int n)
{
int** L = new int*[m+1];
for(int i = 0; i < m+1; ++i)
L[i] = new int[n+1];
printf("i am hre\n");
int i, j;
for(i=0; i<=m; i++)
{
printf("i am hre1\n");
for(j=0; j<=n; j++)
{
if(i==0 || j==0)
L[i][j] = 0;
else if(X[i-1]==Y[j-1])
L[i][j] = L[i-1][j-1]+1;
else
L[i][j] = MAX(L[i-1][j],L[i][j-1]);
}
}
int tt = L[m][n];
printf("i am hre2\n");
for (i = 0; i < m+1; i++)
delete [] L[i];
delete [] L;
return tt;
}
And my Python code is like this:
from ctypes import cdll
import ctypes
lib = cdll.LoadLibrary('./liblcs.so')
la = 36840
lb = 833841
a = (ctypes.c_ulong * la)()
b = (ctypes.c_ulong * lb)()
for i in range(la):
a[i] = 1
for i in range(lb):
b[i] = 1
print "test"
lib._Z9LCSLengthPmS_ii(a, b, la, lb)
In my understanding, after the new operations in the C++ code, which allocate a large amount of memory on the heap, there should be no further memory consumption inside the loop.
However, to my surprise, I observed that the memory usage keeps increasing while the loop runs. (I am watching top on Linux, and the program keeps printing "i am hre1" until the process gets killed.)
This really confuses me: after the memory allocation, there are only some arithmetic operations inside the loop, so why does the code take more memory?
Am I clear enough? Could anyone give me some help on this issue? Thank you!
You're consuming too much memory. The reason the system does not die at allocation time is that Linux allows you to allocate more memory than it can actually back with physical RAM:
http://serverfault.com/questions/141988/avoid-linux-out-of-memory-application-teardown
I just did the same thing on a test machine. I was able to get past the calls to new and start the loop; only when the system decided that I was eating too much of the available RAM did it kill the process.
This is what I got: a lovely OOM message in dmesg.
[287602.898843] Out of memory: Kill process 7476 (a.out) score 792 or sacrifice child
[287602.899900] Killed process 7476 (a.out) total-vm:2885212kB, anon-rss:907032kB, file-rss:0kB, shmem-rss:0kB
On Linux you would see something like this in your kernel logs or as the output from dmesg...
[287585.306678] Out of memory: Kill process 7469 (a.out) score 787 or sacrifice child
[287585.307759] Killed process 7469 (a.out) total-vm:2885208kB, anon-rss:906912kB, file-rss:4kB, shmem-rss:0kB
[287602.754624] a.out invoked oom-killer: gfp_mask=0x24201ca, order=0, oom_score_adj=0
[287602.755843] a.out cpuset=/ mems_allowed=0
[287602.756482] CPU: 0 PID: 7476 Comm: a.out Not tainted 4.5.0-x86_64-linode65 #2
[287602.757592] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.2-0-g33fbe13 by qemu-project.org 04/01/2014
[287602.759461] 0000000000000000 ffff88003d845780 ffffffff815abd27 0000000000000000
[287602.760689] 0000000000000282 ffff88003a377c58 ffffffff811d0e82 ffff8800397f8270
[287602.761915] 0000000000f7d192 000105902804d798 ffffffff81046a71 ffff88003d845780
[287602.763192] Call Trace:
[287602.763532] [<ffffffff815abd27>] ? dump_stack+0x63/0x84
[287602.774614] [<ffffffff811d0e82>] ? dump_header+0x59/0x1ed
[287602.775454] [<ffffffff81046a71>] ? kvm_clock_read+0x1b/0x1d
[287602.776322] [<ffffffff8112b046>] ? ktime_get+0x49/0x91
[287602.777127] [<ffffffff81156c83>] ? delayacct_end+0x3b/0x60
[287602.777970] [<ffffffff81187c11>] ? oom_kill_process+0xc0/0x367
[287602.778866] [<ffffffff811882c5>] ? out_of_memory+0x3bf/0x406
[287602.779755] [<ffffffff8118c646>] ? __alloc_pages_nodemask+0x8fc/0xa6b
[287602.780756] [<ffffffff811c095d>] ? alloc_pages_current+0xbc/0xe0
[287602.781686] [<ffffffff81186c1d>] ? filemap_fault+0x2d3/0x48b
[287602.782561] [<ffffffff8128adea>] ? ext4_filemap_fault+0x37/0x51
[287602.783511] [<ffffffff811a9d56>] ? __do_fault+0x68/0xb1
[287602.784310] [<ffffffff811adcaa>] ? handle_mm_fault+0x6a4/0xd1b
[287602.785216] [<ffffffff810496cd>] ? __do_page_fault+0x33d/0x398
[287602.786124] [<ffffffff819c6ab8>] ? async_page_fault+0x28/0x30
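The effect is easy to reproduce in isolation. A minimal sketch, assuming a 64-bit Linux box with default overcommit settings: each malloc of an untouched block succeeds, and physical pages are committed only by the memset, so the OOM killer strikes during the writes rather than at allocation time.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    const size_t CHUNK = 100 * 1024 * 1024;   /* 100 MB per step */
    for (int i = 0; ; i++) {
        char *p = malloc(CHUNK);              /* virtual address space only */
        if (!p) { printf("malloc failed at chunk %d\n", i); return 1; }
        memset(p, 1, CHUNK);                  /* commits the pages; expect the kill here */
        printf("committed %d x 100 MB\n", i + 1);
    }
}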
Take a look at what you are doing:
#include <iostream>
int main(){
int m = 36840;
int n = 833841;
unsigned long total = 0;
total += (sizeof(int*) * (m+1));        // the array of m+1 row pointers
for(int i = 0; i < m+1; ++i){
    total += (sizeof(int) * (n+1));     // one row of n+1 ints
}
std::cout << total << '\n';
}
You're simply consuming too much memory.
If the size of your int is 4 bytes, you are allocating 122 GB.
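Incidentally, the recurrence only ever reads the previous row and the current row, so two rows of n+1 ints suffice, cutting the footprint from ~123 GB to a few megabytes. A sketch of that variant (hypothetical name LCSLength2; it drops the full table, so it can recover only the length, not the subsequence itself):

int LCSLength2(long unsigned X[], long unsigned Y[], int m, int n)
{
    int* prev = new int[n+1]();   // row i-1, zero-initialized
    int* curr = new int[n+1]();   // row i
    for (int i = 1; i <= m; i++) {
        for (int j = 1; j <= n; j++) {
            if (X[i-1] == Y[j-1])
                curr[j] = prev[j-1] + 1;
            else
                curr[j] = prev[j] > curr[j-1] ? prev[j] : curr[j-1];
        }
        int* tmp = prev; prev = curr; curr = tmp;  // swap: prev now holds row i
    }
    int tt = prev[n];             // after the last swap, prev is row m
    delete [] prev;
    delete [] curr;
    return tt;
}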
I've been busy the last couple of months debugging a rare crash caused somewhere within a very large proprietary C++ image processing library, compiled with GCC 4.7.2 for an ARM Cortex-A9 Linux target. Since a common symptom was glibc complaining about heap corruption, the first step was to employ a heap corruption checker to catch oob memory writes. I used the technique described in https://stackoverflow.com/a/17850402/3779334 to divert all calls to free/malloc to my own function, padding every allocated chunk of memory with some amount of known data to catch out-of-bounds writes - but found nothing, even when padding with as much as 1 KB before and after every single allocated block (there are hundreds of thousands of allocated blocks due to intensive use of STL containers, so I can't enlarge the padding further, plus I assume any write more than 1KB out of bounds would eventually trigger a segfault anyway). This bounds checker has found other problems in the past so I don't doubt its functionality.
(Before anyone says 'Valgrind', yes, I have tried that too with no results either.)
Now, my memory bounds checker also has a feature where it prepends every allocated block with a data struct. These structs are all linked in one long linked list, to allow me to occasionally go over all allocations and test memory integrity. For some reason, even though all manipulations of this list are mutex protected, the list was getting corrupted. When investigating the issue, it began to seem like the mutex itself was occasionally failing to do its job. Here is the pseudocode:
pthread_mutex_t alloc_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool boolmutex; // set to false during init. volatile has no effect.
void malloc_wrapper() {
// ...
pthread_mutex_lock(&alloc_mutex);
if (boolmutex) {
printf("mutex misbehaving\n");
__THROW_ERROR__; // this happens!
}
boolmutex = true;
// manipulate linked list here
boolmutex = false;
pthread_mutex_unlock(&alloc_mutex);
// ...
}
The code commented with "this happens!" is occasionally reached, even though this seems impossible. My first theory was that the mutex data structure was being overwritten. I placed the mutex within a struct, with large arrays before and after it, but when this problem occurred the arrays were untouched so nothing seems to be overwritten.
So.. What kind of corruption could possibly cause this to happen, and how would I find and fix the cause?
A few more notes. The test program uses 3-4 threads for processing. Running with fewer threads seems to make the corruption less common, but it does not disappear. Each test runs for about 20 seconds and completes successfully in the vast majority of cases (I can have 10 units repeating the test, with the first failure occurring after 5 minutes to several hours). When the problem occurs it is quite late in the test (say, 15 seconds in), so this isn't a bad-initialization issue. The memory bounds checker never catches actual out-of-bounds writes, but glibc still occasionally fails with a corrupted-heap error (can such an error be caused by something other than an oob write?). Each failure generates a core dump with plenty of trace information; there is no pattern I can see in these dumps, no particular section of code that shows up more than others. This problem seems very specific to a particular family of algorithms and does not happen in other algorithms, so I'm quite certain this isn't a sporadic hardware or memory error. I have done many more tests to check for oob heap accesses, which I won't list here to keep this post from getting any longer.
Thanks in advance for any help!
Thanks to all commenters. I tried nearly all the suggestions, with no results, and finally decided to write a simple memory-allocation stress test: one that runs a thread on each CPU core (my unit is a Freescale i.MX6 quad-core SoC), each thread allocating and freeing memory in random order at high speed. The test crashed with a glibc memory corruption error within minutes, or a few hours at most.
Updating the kernel from 3.0.35 to 3.0.101 solved the problem; both the stress test and the image processing algorithm now run overnight without failing. The problem does not reproduce on Intel machines with the same kernel version, so the problem is specific either to ARM in general or perhaps to some patch Freescale included with the specific BSP version that included kernel 3.0.35.
For those curious, the stress test source code is attached below. Set NUM_THREADS to the number of CPU cores and build with:
<cross-compiler-prefix>g++ -O3 test_heap.cpp -lpthread -o test_heap
I hope this information helps someone. Cheers :)
// Multithreaded heap stress test. By Itay Chamiel 20151012.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <assert.h>
#include <pthread.h>
#include <sys/time.h>
#define NUM_THREADS 4 // set to number of CPU cores
#define ALIVE_INDICATOR NUM_THREADS
// Each thread constantly allocates and frees memory. In each iteration of the infinite loop, decide at random whether to
// allocate or free a block of memory. A list of 500-1000 allocated blocks is maintained by each thread. When memory is allocated
// it is added to this list; when freeing, a random block is selected from this list, freed and removed from the list.
void* thr(void* arg) {
int* alive_flag = (int*)arg;
int thread_id = *alive_flag; // this is a number between 0 and (NUM_THREADS-1) given by main()
int cnt = 0;
timeval t_pre, t_post;
gettimeofday(&t_pre, NULL);
const int ALLOCATE=1, FREE=0;
const unsigned int MINSIZE=500, MAXSIZE=1000;
const int MAX_ALLOC=10000;
char* membufs[MAXSIZE];
unsigned int membufs_size = 0;
int num_allocs = 0, num_frees = 0;
while(1)
{
int action;
// Decide whether to allocate or free a memory block.
// if we have less than MINSIZE buffers, allocate.
if (membufs_size < MINSIZE) action = ALLOCATE;
// if we have MAXSIZE, free.
else if (membufs_size >= MAXSIZE) action = FREE;
// else, decide randomly.
else {
action = ((rand() & 0x1)? ALLOCATE : FREE);
}
if (action == ALLOCATE) {
// choose size to allocate, from 1 to MAX_ALLOC bytes
size_t size = (rand() % MAX_ALLOC) + 1;
// allocate and fill memory
char* buf = (char*)malloc(size);
memset(buf, 0x77, size);
// add buffer to list
membufs[membufs_size] = buf;
membufs_size++;
assert(membufs_size <= MAXSIZE);
num_allocs++;
}
else { // action == FREE
// choose a random buffer to free
size_t pos = rand() % membufs_size;
assert (pos < membufs_size);
// free and remove from list by replacing entry with last member
free(membufs[pos]);
membufs[pos] = membufs[membufs_size-1];
membufs_size--;
assert(membufs_size >= 0);
num_frees++;
}
// once in 10 seconds print a status update
gettimeofday(&t_post, NULL);
if (t_post.tv_sec - t_pre.tv_sec >= 10) {
printf("Thread %d [%d] - %d allocs %d frees. Alloced blocks %u.\n", thread_id, cnt++, num_allocs, num_frees, membufs_size);
gettimeofday(&t_pre, NULL);
}
// indicate alive to main thread
*alive_flag = ALIVE_INDICATOR;
}
return NULL;
}
int main()
{
int alive_flag[NUM_THREADS];
printf("Memory allocation stress test running on %d threads.\n", NUM_THREADS);
// start a thread for each core
for (int i=0; i<NUM_THREADS; i++) {
alive_flag[i] = i; // tell each thread its ID.
pthread_t th;
int ret = pthread_create(&th, NULL, thr, &alive_flag[i]);
assert(ret == 0);
}
while(1) {
sleep(10);
// check that all threads are alive
bool ok = true;
for (int i=0; i<NUM_THREADS; i++) {
if (alive_flag[i] != ALIVE_INDICATOR)
{
printf("Thread %d is not responding\n", i);
ok = false;
}
}
assert(ok);
for (int i=0; i<NUM_THREADS; i++)
alive_flag[i] = 0;
}
return 0;
}
I wrote a simple application to test memory consumption. In this test application, I created four processes that continually consume memory; those processes won't release the memory unless the process exits.
I expected this test application to consume most of the RAM and cause other applications to slow down or crash. But the result is not what I expected. Below is the code:
#include <stdio.h>
#include <unistd.h>
#include <list>
#include <vector>
using namespace std;
unsigned short calcrc(unsigned char *ptr, int count)
{
unsigned short crc;
unsigned char i;
//high cpu-consumption code
//implements the CRC algorithm
//CRC is Cyclic Redundancy Code
}
void* ForkChild(void* param){
vector<unsigned char*> MemoryVector;
pid_t PID = fork();
if (PID > 0){
const int TEN_MEGA = 10 * 10 * 1024 * 1024; // note: despite the name, this is 100 MB
unsigned char* buffer = NULL;
while(1){
buffer = NULL;
buffer = new unsigned char [TEN_MEGA];
if (buffer){
try{
calcrc(buffer, TEN_MEGA);
MemoryVector.push_back(buffer);
} catch(...){
printf("An error was throwed, but caught by our app!\n");
delete [] buffer;
buffer = NULL;
}
}
else{
printf("no memory to allocate!\n");
try{
if (MemoryVector.size()){
buffer = MemoryVector[0];
calcrc(buffer, TEN_MEGA);
buffer = NULL;
} else {
printf("no memory ever allocated for this Process!\n");
continue;
}
} catch(...){
printf("An error was throwed -- branch 2,"
"but caught by our app!\n");
buffer = NULL;
}
}
} //while(1)
} else if (PID == 0){
} else {
perror("fork error");
}
return NULL;
}
int main(){
int children = 4;
while(--children >= 0){
ForkChild(NULL);
};
while(1) sleep(1);
printf("exiting main process\n");
return 0;
}
TOP command
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2775 steve 20 0 1503m 508 312 R 99.5 0.0 1:00.46 test
2777 steve 20 0 1503m 508 312 R 96.9 0.0 1:00.54 test
2774 steve 20 0 1503m 904 708 R 96.6 0.0 0:59.92 test
2776 steve 20 0 1503m 508 312 R 96.2 0.0 1:00.57 test
Though CPU usage is high, the memory percentage remains 0.0. How is that possible?
Free command
             free   shared  buffers   cached
Mem:      3083796        0    55996   428296
Free memory is more than 3 GB out of 4 GB of RAM.
Does anybody know why this test app doesn't work as expected?
Linux uses optimistic memory allocation: it will not physically allocate a page of memory until that page is actually written to. For that reason, you can allocate much more memory than what is available, without increasing memory consumption by the system.
If you want to force the system to allocate (commit) a physical page, then you have to write to it.
The following line does not issue any writes, since default-initialization of unsigned char is a no-op:
buffer = new unsigned char [TEN_MEGA];
If you want to force a commit, use zero-initialization:
buffer = new unsigned char [TEN_MEGA]();
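A small sketch to watch the difference (Linux-specific; it reads /proc/self/statm, whose first two fields are the virtual size and the resident set size, both in pages):

#include <cstdio>

static void print_mem(const char* label)
{
    long size = 0, rss = 0;
    std::FILE* f = std::fopen("/proc/self/statm", "r");
    if (f) {
        if (std::fscanf(f, "%ld %ld", &size, &rss) != 2)
            size = rss = -1;
        std::fclose(f);
    }
    std::printf("%-14s vsz=%ld pages, rss=%ld pages\n", label, size, rss);
}

int main()
{
    const int TEN_MEGA = 10 * 1024 * 1024;
    print_mem("start");
    unsigned char* a = new unsigned char[TEN_MEGA];   // default-init: pages never touched
    print_mem("after new[]");
    unsigned char* b = new unsigned char[TEN_MEGA](); // zero-init: every page written
    print_mem("after new[]()");
    delete [] b;
    delete [] a;
    return 0;
}

The vsz field should jump after both allocations, while rss should jump only after the zero-initialized one.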
To make the comments into an answer:
Linux will not allocate physical memory pages for a process until it writes to them (untouched anonymous pages are mapped copy-on-write to a shared zero page).
Additionally, you are not writing to your buffer anywhere, as the default constructor for unsigned char does not perform any initializations, and new[] default-initializes all items.
fork() returns the PID in the parent, and 0 in the child. Your ForkChild as written will execute all the work in the parent, not the child.
And the standard new operator will never return null; it throws std::bad_alloc if it fails to allocate memory (though due to overcommit it won't usually do that on Linux either). This means your test of buffer after the allocation is meaningless: it will always either take the first branch or never reach the test. If you want a null return, you need to write new (std::nothrow) .... Include <new> for that to work.
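A sketch of the corrected fork branching (the child does the work):

pid_t PID = fork();
if (PID == 0) {
    // child: run the allocation loop here
} else if (PID > 0) {
    // parent: optionally remember the child PID
} else {
    perror("fork error");
}

And the non-throwing allocation form, which makes the null test meaningful:

#include <new>  // for std::nothrow

unsigned char* buffer = new (std::nothrow) unsigned char [TEN_MEGA];
if (buffer == NULL) {
    // reachable now: null is returned instead of std::bad_alloc being thrown
}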
But your program is in fact doing what you expected it to do. As Michael Foukarakis's answer points out, memory that is not used is not allocated. In the output of top, I noticed that the VIRT column showed a large amount of memory for each process running your program. A little googling later, I found what this was:
VIRT -- Virtual Memory Size (KiB). The total amount of virtual memory used by the task. It includes all code, data and shared libraries plus pages that have been swapped out and pages that have been mapped but not used.
So as you can see, your program does in fact acquire memory for itself, but only in the form of virtual memory pages, and I think that is a smart thing for the system to do.
A snippet from this wiki page
A page, memory page, or virtual page -- a fixed-length contiguous block of virtual memory, and it is the smallest unit of data for the following:
memory allocation performed by the operating system for a program; and
transfer between main memory and any other auxiliary store, such as a hard disk drive.
...Thus a program can address more (virtual) RAM than physically exists in the computer. Virtual memory is a scheme that gives users the illusion of working with a large block of contiguous memory space (perhaps even larger than real memory), when in actuality most of their work is on auxiliary storage (disk). Fixed-size blocks (pages) or variable-size blocks of the job are read into main memory as needed.
Sources:
http://www.computerhope.com/unix/top.htm
https://stackoverflow.com/a/18917909/2089675
http://en.wikipedia.org/wiki/Page_(computer_memory)
If you want to gobble up a lot of memory:
#include <stdlib.h>
#include <string.h>

int mb = 0;
char* buffer;
while (1) {
    buffer = malloc(1024*1024);       /* 1 MB of virtual memory */
    if (buffer == NULL) break;
    memset(buffer, 0, 1024*1024);     /* touch every page so it is committed */
    mb++;
}
I used something like this to make sure the file buffer cache was empty when taking file I/O timing measurements.
As other answers have already mentioned, your code never writes to the buffer after allocating it; here, memset is used to write to it.