Map causing Seg Fault. How to increase memory? - c++

I have a simple question. I am reading a few files; one of them is around ~20000 lines.
Each line has 5 fields. I also use some other ADTs (vectors and lists), but those do not cause a segfault.
The map itself stores a key/value pair, roughly one per line.
When I added the map to my code I would instantly get a segfault. I copied 5000 of the 20000 lines and still got a segfault; with 1000 lines it worked.
In Java there is a way to increase the amount of memory virtually allocated to the program; is there a way to do so in C++? I have even deleted elements once they are no longer used, which gets me to around 2000 lines, but no further.
Here is gdb:
(gdb) exec-file readin
(gdb) run
Starting program: /x/x/x/readin readin
Program exited normally.
valgrind:
HEAP SUMMARY:
==7948== in use at exit: 0 bytes in 0 blocks
==7948== total heap usage: 20,206 allocs, 20,206 frees, 2,661,509 bytes allocated
==7948==
==7948== All heap blocks were freed -- no leaks are possible
code:
....
Flow flw = endQueue.top();
stringstream str1;
stringstream str2;
if (flw.getSrc() < flw.getDest()) {
    str1 << flw.getSrc();
    str2 << flw.getDest();
    flw_src_dest = str1.str() + "-" + str2.str();
} else {
    str1 << flw.getSrc();
    str2 << flw.getDest();
    flw_src_dest = str2.str() + "-" + str1.str();
}
while (int_start > flw.getEnd()) {
    if (flw.getFlow() == 1) {
        ava_bw[flw_src_dest] += 5.5;
    } else {
        ava_bw[flw_src_dest] += 2.5;
    }
    endQueue.pop();
}

A segmentation fault doesn't necessarily indicate that you're out of memory. In fact, with C++, it's highly unlikely: you would usually get a bad_alloc or somesuch in this case (unless you're dumping everything in objects with automatic storage duration?!).
More likely, you have a memory corruption bug in your code that just happens to become noticeable only once you have more than a certain number of objects.
At any rate, the solution to memory faults is not to blindly throw more memory at the program.
Run your code through valgrind and through a debugger, and see what the real problem is.
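As a small illustration of that point (a sketch only, not anything from the original code: the exact behaviour depends on OS limits and the overcommit policy, and the process may instead be killed by the OS), exhausting the free store normally shows up as a std::bad_alloc exception from new rather than a segfault:

#include <cstddef>
#include <iostream>
#include <new>

int main() {
    std::size_t total = 0;
    try {
        for (;;) {
            new char[100000000];      // deliberately leak 100 MB blocks
            total += 100000000;
        }
    } catch (const std::bad_alloc& e) {
        std::cerr << "new threw bad_alloc after roughly "
                  << total / 1000000 << " MB\n";
    }
    return 0;
}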

Be careful erasing elements from a container while you are iterating over the container.
for (pos = ava_bw.begin(); pos != ava_bw.end(); ++pos) {
    if (pos->second == INIT) {
        ava_bw.erase(pos);
    }
}
For a std::map, erase(pos) invalidates pos, so the ++pos in the loop header is then undefined behaviour: at best it skips an element, at worst it walks past ava_bw.end().
The same thing happens with a vector: pos is invalidated as soon as you erase through it.
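A sketch of the usual erase-while-iterating idiom (assuming ava_bw is a std::map and INIT is the sentinel value from the question) is to let erase hand you the next iterator instead of incrementing in the loop header:

for (auto pos = ava_bw.begin(); pos != ava_bw.end(); ) {
    if (pos->second == INIT) {
        pos = ava_bw.erase(pos);   // C++11: erase returns the next valid iterator
        // pre-C++11 alternative: ava_bw.erase(pos++);
    } else {
        ++pos;
    }
}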
Edit
In the while loop you do
while (int_start > flw.getEnd()) {
    if (flw.getFlow() == 1) {
        ava_bw[flw_src_dest] += 5.5;
    } else {
        ava_bw[flw_src_dest] += 2.5;
    }
    endQueue.pop();
}
You need to do flw = endQueue.top() again inside the loop; otherwise flw never changes and the loop condition can never become false.
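A sketch of what the corrected loop might look like (assuming the Flow and endQueue types from the question; flw_src_dest would presumably need rebuilding per element as well):

while (!endQueue.empty() && int_start > flw.getEnd()) {
    if (flw.getFlow() == 1) {
        ava_bw[flw_src_dest] += 5.5;
    } else {
        ava_bw[flw_src_dest] += 2.5;
    }
    endQueue.pop();
    if (endQueue.empty()) {
        break;                  // nothing left to process
    }
    flw = endQueue.top();       // re-read the new top, otherwise the loop never advances
    // (flw_src_dest would likely need to be recomputed from the new flw here too)
}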

Generally speaking, in C/C++ the maximum amount of available heap isn't fixed at the start of the program: you can always allocate more memory, either directly via new/malloc or by using standard containers such as std::list, which do it for you.

I don't think the problem is memory: a C++ program gets as much memory as it asks for, up to hogging all the available memory on your PC. Check whether you delete something that you access later on.

Related

Memory error at lower limit than expected when implementing Sieve of Eratosthenes in C++

My question is related to a problem described here. I have written a C++ implementation of the Sieve of Eratosthenes that hits a memory overflow if I set the target value too high. As suggested in that question, I am able to fix the problem by using a vector<bool> instead of a plain bool array.
However, I am hitting the memory overflow at a much lower value than expected, around n = 1 200 000. The discussion in the thread linked above suggests that a plain C++ bool array uses a byte for each entry, so with 2 GB of RAM I would expect to be able to get somewhere on the order of n = 2 000 000 000. Why is the practical memory limit so much smaller?
And why does using vector<bool>, which encodes the booleans as bits instead of bytes, yield more than an eightfold increase in the computable limit?
Here is a working example of my code, with n set to a small value.
#include <iostream>
#include <cmath>
#include <vector>

using namespace std;

int main() {
    // Count and sum of primes below target
    const int target = 100000;

    // Code I want to use:
    bool is_idx_prime[target];
    for (unsigned int i = 0; i < target; i++) {
        // initialize by assuming prime
        is_idx_prime[i] = true;
    }

    // But doesn't work for target larger than ~1200000
    // Have to use this instead
    // vector <bool> is_idx_prime(target, true);

    for (unsigned int i = 2; i < sqrt(target); i++) {
        // All multiples of i * i are nonprime
        // If i itself is nonprime, no need to check
        if (is_idx_prime[i]) {
            for (int j = i; i * j < target; j++) {
                is_idx_prime[i * j] = 0;
            }
        }
    }

    // 0 and 1 are nonprime by definition
    is_idx_prime[0] = 0; is_idx_prime[1] = 0;

    unsigned long long int total = 0;
    unsigned int count = 0;
    for (int i = 0; i < target; i++) {
        // cout << "\n" << i << ": " << is_idx_prime[i];
        if (is_idx_prime[i]) {
            total += i;
            count++;
        }
    }

    cout << "\nCount: " << count;
    cout << "\nTotal: " << total;
    return 0;
}
outputs
Count: 9592
Total: 454396537
C:\Users\[...].exe (process 1004) exited with code 0.
Press any key to close this window . . .
Or, changing to n = 1 200 000 yields
C:\Users\[...].exe (process 3144) exited with code -1073741571.
Press any key to close this window . . .
I am using the Microsoft Visual Studio compiler on Windows with the default settings.
Turning the comment into a full answer:
Your operating system reserves a special section of memory to represent the call stack of your program. Each function call pushes a new stack frame onto the stack; when the function returns, its frame is removed. The stack frame includes the memory for the function's parameters and its local variables.
The remaining memory is referred to as the heap. On the heap, arbitrary memory allocations can be made, whereas the structure of the stack is governed by the control flow of your program. Only a limited amount of memory is reserved for the stack; when it gets full (e.g. due to too many nested function calls or too-large local objects), you get a stack overflow. For this reason, large objects should be allocated on the heap.
General references on stack/heap: Link, Link
To allocate memory on the heap in C++, you can:
Use vector<bool> is_idx_prime(target);, which internally does a heap allocation and deallocates the memory for you when the vector goes out of scope. This is the most convenient way.
Use a smart pointer to manage the allocation: auto is_idx_prime = std::make_unique<bool[]>(target); This will also automatically deallocate the memory when the array goes out of scope.
Allocate the memory manually. I am mentioning this only for educational purposes. As mentioned by Paul in the comments, doing a manual memory allocation is generally not advisable, because you have to manually deallocate the memory again. If you have a large program with many memory allocations, inevitably you will forget to free some allocation, creating a memory leak. When you have a long-running program, such as a system service, creating repeated memory leaks will eventually fill up the entire memory (and speaking from personal experience, this absolutely does happen in practice). But in theory, if you would want to make a manual memory allocation, you would use bool *is_idx_prime = new bool[target]; and then later deallocate again with delete [] is_idx_prime.
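To make the three options concrete, here is a minimal sketch applied to the question's sieve array (target as in the question; the variable names are otherwise made up for illustration):

#include <memory>
#include <vector>

int main() {
    const int target = 100000;

    // 1. std::vector allocates on the heap and frees itself when it goes out of scope.
    std::vector<bool> is_idx_prime_vec(target, true);

    // 2. A smart pointer owns a plain heap array (elements value-initialized to false).
    auto is_idx_prime_ptr = std::make_unique<bool[]>(target);

    // 3. Manual new[]/delete[], shown for completeness only; easy to leak.
    bool* is_idx_prime_raw = new bool[target]();
    delete[] is_idx_prime_raw;

    return 0;
}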

Apparent memory leak while using 2D vector in C++

I decided to use C++ 2D vectors for an application. While testing some code I encountered a bad_alloc error. I read that this may be caused by a memory shortage, so I investigated my program's memory use by calling the malloc_stats function at different steps in gdb.
While I am not sure I completely understand the output of that function, it seems to indicate memory leaks.
I tried to sum up the problem in the short code below:
#include <vector>
#include <iostream>

using namespace std;

vector<vector<double>> get_predictions() {
    vector<vector<double>> predictions;
    for (int i = 0; i < 10; i++) {
        vector<double> new_state;
        new_state.push_back(600);
        new_state.push_back(450);
        new_state.push_back(100);
        new_state.push_back(200);
        predictions.push_back(new_state);
    }
    return predictions;
}

int main()
{
    cout << "start" << endl;

    // time loop
    for (int i = 0; i < 10; i++) {
        auto predictions = get_predictions();
        // code that uses the latest predictions
    }

    cout << "end" << endl;
    return 0;
}
Now if I call malloc_stats() at the "start" line, the output is similar to this:
Arena 0:
system bytes = 135168
in use bytes = 74352
Total (incl. mmap):
system bytes = 135168
in use bytes = 74352
max mmap regions = 0
max mmap bytes = 0
At the "end" step, the function gives:
Arena 0:
system bytes = 135168
in use bytes = 75568
Total (incl. mmap):
system bytes = 135168
in use bytes = 75568
max mmap regions = 0
max mmap bytes = 0
The "In use bytes" field clearly increased.
Does it really mean that more allocated memory is being held?
If so, why? Shouldn't the allocated memory be freed once the various vectors go out of scope?
And how can I avoid such memory issues?
Thanks a lot
Following the suggestions in the comments, I abandoned malloc_stats, which does not behave the way I expected, and tried Valgrind on my code instead.
Valgrind confirms that there is no leak in the toy example above.
As for my main application, it is a program running on ARM. For reasons unknown to me, the stock Valgrind obtained via apt install valgrind did not output the lines concerning memory leaks; I had to recompile it from the sources available on the Valgrind website. Presumably the problem comes from architecture-specific particularities.
As a side note, my application also uses CUDA code, which leads to many false positives in the Valgrind output.

How does memory on the heap get exhausted?

I have been testing some of my own code to see how much allocated memory it takes to exhaust the heap, or free store. However, unless my test code is wrong, I am getting completely different results for how much memory can be put on the heap.
I am testing two different programs. The first program creates vector objects on the heap. The second program creates integer objects on the heap.
Here is my code:
#include <vector>
#include <stdio.h>

int main()
{
    long long unsigned bytes = 0;
    unsigned megabytes = 0;
    for (long long unsigned i = 0; ; i++) {
        std::vector<int>* pt1 = new std::vector<int>(100000, 10);
        bytes += sizeof(*pt1);
        bytes += pt1->size() * sizeof(pt1->at(0));
        megabytes = bytes / 1000000;
        if (i >= 1000 && i % 1000 == 0) {
            printf("There are %d megabytes on the heap\n", megabytes);
        }
    }
}
The final output of this code before getting a bad_alloc error is: "There are 2000 megabytes on the heap"
In the second program:
#include <stdio.h>

int main()
{
    long long unsigned bytes = 0;
    unsigned megabytes = 0;
    for (long long unsigned i = 0; ; i++) {
        int* pt1 = new int(10);
        bytes += sizeof(*pt1);
        megabytes = bytes / 1000000;
        if (i >= 100000 && i % 100000 == 0) {
            printf("There are %d megabytes on the heap\n", megabytes);
        }
    }
}
The final output of this code before getting a bad_alloc error is: "There are 511 megabytes on the heap"
The final output in both programs is vastly different. Am I misunderstanding something about the free store? I thought that both results would be about the same.
It is very likely that pointers returned by new on your platform are 16-byte aligned.
If int is 4 bytes, this means that for every new int(10) you're getting four bytes and making 12 bytes unusable.
This alone would explain the difference between getting 500MB of usable space from small allocations and 2000MB from large ones.
On top of that, there's overhead of keeping track of allocated blocks (at a minimum, of their size and whether they're free or in use). That is very much specific to your system's memory allocator but also incurs per-allocation overhead. See "What is a Chunk" in https://sourceware.org/glibc/wiki/MallocInternals for an explanation of glibc's allocator.
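One rough way to see this effect (a sketch only; the spacing between blocks is an implementation detail of the allocator, so the number will vary by platform) is to print the distance between two consecutive single-int allocations:

#include <cstdint>
#include <cstdlib>
#include <iostream>

int main() {
    int* a = new int(10);
    int* b = new int(10);

    std::cout << "sizeof(int): " << sizeof(int) << " bytes\n";
    std::cout << "distance between the two allocations: "
              << std::abs(reinterpret_cast<std::intptr_t>(b) -
                          reinterpret_cast<std::intptr_t>(a))
              << " bytes\n";

    delete a;
    delete b;
    return 0;
}

On many systems this prints something considerably larger than 4, which is exactly the per-allocation waste described above.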
First of all, you have to understand that the operating system assigns memory to a process in fairly large chunks called pages (this is a hardware property). Page sizes are typically 4-16 kB.
The standard library tries to use this memory efficiently, so it has to chop pages into smaller pieces and manage them. To do that, some extra information about the heap structure has to be maintained.
There is a good Andrei Alexandrescu CppCon talk covering more or less how this works (it omits the details of page management).
So when you allocate lots of small objects, the bookkeeping for the heap structure becomes quite large. If you instead allocate a smaller number of larger objects, it is more efficient: less memory is wasted on tracking the heap structure.
Note also that, depending on the heap strategy, it is sometimes more efficient (when a small piece of memory is requested) to waste some memory and return a larger block than was requested.
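For that last point, here is a small glibc-specific sketch (malloc_usable_size is a glibc extension, so this is not portable) showing a tiny request being rounded up to a larger block:

#include <cstdio>
#include <cstdlib>
#include <malloc.h>

int main() {
    void* p = std::malloc(5);                       // ask for 5 bytes
    std::printf("requested 5 bytes, usable size: %zu bytes\n",
                malloc_usable_size(p));             // typically larger than 5
    std::free(p);
    return 0;
}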

Memory usage doesn't increase when allocating memory

I would like to find out the number of bytes used by a process from within a C++ program by inspecting the operating system's memory information. The reason is to measure the possible overhead of memory allocation (due to memory control blocks, nodes in free lists, etc.). Currently I am on macOS and am using this code:
#include <mach/mach.h>
#include <iostream>

int getResidentMemoryUsage() {
    task_basic_info t_info;
    mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;
    if (task_info(mach_task_self(), TASK_BASIC_INFO,
                  reinterpret_cast<task_info_t>(&t_info),
                  &t_info_count) == KERN_SUCCESS) {
        return t_info.resident_size;
    }
    return -1;
}

int getVirtualMemoryUsage() {
    task_basic_info t_info;
    mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;
    if (task_info(mach_task_self(), TASK_BASIC_INFO,
                  reinterpret_cast<task_info_t>(&t_info),
                  &t_info_count) == KERN_SUCCESS) {
        return t_info.virtual_size;
    }
    return -1;
}

int main(void) {
    int virtualMemoryBefore = getVirtualMemoryUsage();
    int residentMemoryBefore = getResidentMemoryUsage();

    int* a = new int(5);

    int virtualMemoryAfter = getVirtualMemoryUsage();
    int residentMemoryAfter = getResidentMemoryUsage();

    std::cout << virtualMemoryBefore << " " << virtualMemoryAfter << std::endl;
    std::cout << residentMemoryBefore << " " << residentMemoryAfter << std::endl;
    return 0;
}
When running this code I would have expected to see that the memory usage has increased after allocating an int. However when I run the above code I get the following output:
75190272 75190272
819200 819200
I have several questions because this output does not make any sense.
Why hasn't either the virtual or the resident memory changed after an integer has been allocated?
How come the operating system allocates such large amounts of memory to a running process?
When I run the code and check Activity Monitor, I find that 304 kB of memory is used, but that number differs from the virtual/resident memory usage obtained programmatically.
My end goal is to find the memory overhead of allocating data, so is there a way to do this? (What I am currently thinking of is determining the bytes used according to the OS and comparing them with the bytes allocated, to find the difference.)
Thank you for reading
The C++ runtime typically allocates a block of memory when a program starts up, and then parcels this out to your code when you use things like new, and adds it back to the block when you call delete. Hence, the operating system doesn't know anything about individual new or delete calls. This is also true for malloc and free in C (or C++)
First, you are measuring a number of pages, not really memory allocated. Second, the runtime pre-allocates a few pages at startup. If you want to observe something, allocate more than a single int: try allocating several thousand and you will see some changes.
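To illustrate that suggestion, here is a macOS-specific sketch (reusing the question's task_info approach; the sizes are arbitrary) that allocates enough memory for the change to show up in whole pages:

#include <mach/mach.h>
#include <cstddef>
#include <iostream>

static long getResidentMemoryUsage() {
    task_basic_info t_info;
    mach_msg_type_number_t t_info_count = TASK_BASIC_INFO_COUNT;
    if (task_info(mach_task_self(), TASK_BASIC_INFO,
                  reinterpret_cast<task_info_t>(&t_info),
                  &t_info_count) == KERN_SUCCESS) {
        return t_info.resident_size;
    }
    return -1;
}

int main() {
    long before = getResidentMemoryUsage();

    // ~40 MB: large enough that the allocator has to request new pages from the OS.
    const std::size_t n = 10000000;
    int* block = new int[n];
    for (std::size_t i = 0; i < n; ++i) {
        block[i] = static_cast<int>(i);   // touch the pages so they are actually committed
    }

    long after = getResidentMemoryUsage();
    std::cout << "resident before: " << before << " bytes\n"
              << "resident after:  " << after  << " bytes\n";

    delete[] block;
    return 0;
}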

HeapWalk not working as expected in Release mode

So I used this example of the HeapWalk function to implement it in my app. I played around with it a bit and saw that when I added
HANDLE d = HeapAlloc(hHeap, 0, sizeof(int));
int* f = new(d) int;
after creating the heap then some new output would be logged:
Allocated block Data portion begins at: 0X037307E0
Size: 4 bytes
Overhead: 28 bytes
Region index: 0
So seeing this I thought I could check Entry.wFlags to see if it was set to PROCESS_HEAP_ENTRY_BUSY, to keep track of how much allocated memory I'm using on the heap. So I have:
HeapLock(heap);

int totalUsedSpace = 0, totalSize = 0, largestFreeSpace = 0, largestCounter = 0;

PROCESS_HEAP_ENTRY entry;
entry.lpData = NULL;
while (HeapWalk(heap, &entry) != FALSE)
{
    int entrySize = entry.cbData + entry.cbOverhead;
    if ((entry.wFlags & PROCESS_HEAP_ENTRY_BUSY) != 0)
    {
        // We have allocated memory in this block
        totalUsedSpace += entrySize;
        largestCounter = 0;
    }
    else
    {
        // We do not have allocated memory in this block
        largestCounter += entrySize;
        if (largestCounter > largestFreeSpace)
        {
            // Save this value as we've found a bigger space
            largestFreeSpace = largestCounter;
        }
    }

    // Keep a track of the total size of this heap
    totalSize += entrySize;
}

HeapUnlock(heap);
And this appears to work when built in debug mode (totalSize and totalUsedSpace are different values). However, when I run it in Release mode totalUsedSpace is always 0.
I stepped through it with the debugger while in Release mode and for each heap it loops three times and I get the following flags in entry.wFlags from calling HeapWalk:
1 (PROCESS_HEAP_REGION)
0
2 (PROCESS_HEAP_UNCOMMITTED_RANGE)
It then exits the while loop and GetLastError() returns ERROR_NO_MORE_ITEMS as expected.
From here I found that a flag value of 0 is "the committed block which is free, i.e. not being allocated or not being used as control structure."
Does anyone know why it does not work as intended when built in Release mode? I don't have much experience of how memory is handled by the computer, so I'm not sure where the error might be coming from. Searching on Google didn't come up with anything so hopefully someone here knows.
UPDATE: I'm still looking into this myself. If I monitor the app using vmmap I can see that the process has 9 heaps, but calling GetProcessHeaps reports 22 heaps. Also, none of the heap handles it returns matches the return value of GetProcessHeap() or _get_heap_handle(). It seems like GetProcessHeaps is not behaving as expected. Here is the code that gets the list of heaps:
// Count how many heaps there are and allocate enough space for them
DWORD numHeaps = GetProcessHeaps(0, NULL);
HANDLE* handles = new HANDLE[numHeaps];

// Get a handle to known heaps for us to compare against
HANDLE defaultHeap = GetProcessHeap();
HANDLE crtHeap = (HANDLE)_get_heap_handle();

// Get a list of handles to all the heaps
DWORD retVal = GetProcessHeaps(numHeaps, handles);
Application Verifier had previously been set up to do full page heap verification of my executable and was interfering with the heaps returned by GetProcessHeaps. I'd forgotten it was set up: it had been enabled for a different issue several days earlier and then closed without clearing the tests. It wasn't happening in the debug build because the application builds to a different file name for debug builds.
We managed to detect this by adding a breakpoint and looking at the call stack of the thread. We could see the Application Verifier DLL had been injected, and that told us where to look.