Segmentation fault while accessing a particular array location/index - C++

I used malloc to allocate 10^9 memory locations for an array.
Code 1 gets executed successfully.
Code 1:
#include <cstdlib>

int main() {
    int size = (int)(1e9);
    int* arr = (int*) malloc(size * sizeof(int));
    for (int i = 0; i < size; i++) {
        arr[i] = i;
    }
    return 0;
}
But when I tried to access one particular index, 12345678 (which is < 1e9), I got a segmentation fault.
Code 2:
#include <cstdlib>
#include <iostream>
using namespace std;

int main() {
    int size = (int)(1e9);
    int* arr = (int*) malloc(size * sizeof(int));
    for (int i = 0; i < size; i++) {
        arr[i] = i;
    }
    cout << arr[12345678] << endl; // added this line, which gives a segmentation fault
    return 0;
}
My guess is that this occurs due to memory fragmentation, but I am not sure. Can anyone kindly explain the actual reason?

It's almost certainly one (or both) of:
Compiler optimisation: in Code 1 the array is never read, so the compiler is free to drop the loop (and even the malloc itself); in Code 2 the cout forces the writes to actually happen.
A limit on memory allocation in the sandbox/OS: a request for roughly 4 GB may be granted nominally but cannot necessarily be backed by physical memory.

Modern operating systems do something called "lazy allocation": when you ask for memory, they hand you the allocation without actually backing it with physical memory.
This is exactly the reason for the difference between the virtual and physical memory figures you see when using top.
The memory is only really committed when you first touch it.
In your example, you've allocated a lot of memory (about 4 GB). The OS let the allocation succeed but did not actually reserve physical memory for it; once you tried to use the buffer, the OS tried to provide the memory you asked for, couldn't, and the process crashed.
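A defensive version of Code 2 would at least check whether malloc failed outright (a minimal sketch; note that under lazy allocation even a non-NULL return does not guarantee that every page can be backed later):
#include <cstdio>
#include <cstdlib>

int main() {
    size_t size = 1000000000;                      // 1e9 elements, ~4 GB of ints
    int* arr = (int*) malloc(size * sizeof(int));
    if (arr == NULL) {                             // malloc can fail up front
        fprintf(stderr, "allocation of %zu bytes failed\n", size * sizeof(int));
        return 1;
    }
    for (size_t i = 0; i < size; i++) {
        arr[i] = (int) i;                          // first touch commits each page
    }
    printf("%d\n", arr[12345678]);
    free(arr);
    return 0;
}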

Related

Segmentation fault in the program below

I was trying to solve the very basic SPOJ problem CANDY.
I am getting a segmentation fault when submitting the solution below, but in Visual Studio it works fine.
I also declared the variables with their size in mind (sum as long long int), because it can be large.
1) Is it because I am declaring the array inside the while loop? Should I declare the array outside the while loop, so that every test case reuses the same array?
2) Since a new array is created on every iteration of the loop (for every test case), will that leak memory, or will the compiler automatically free it after every test case? (I know that with dynamic memory allocation we have to free the memory explicitly.) In which scope should I declare the variables?
I have these doubts because a segmentation fault is about memory access.
#include<iostream>
using namespace std;

int main() {
    while (1) {
        int n;
        int arr[10001];
        cin >> n;
        if (n == -1)
            break;
        long long int sum = 0;
        for (int i = 0; i < n; i++) {
            int temp;
            cin >> temp;
            sum += temp;
            arr[i] = temp;
        }
        int mean = sum / n;
        if ((sum % n) != 0) {
            cout << -1 << endl;
            continue;
        }
        int count1 = 0;
        for (int i = 0; i < n; i++) {
            if (arr[i] > mean) {
                count1 += (arr[i] - mean);
            }
        }
        cout << count1 << endl;
    }
}
Your problem is probably due to the stack allocation of int arr[10001], which is roughly a 40 kB allocation. Now, "allocation" is the wrong word here: the compiler essentially just computes the address of arr by doing something like int * arr = STACK_POINTER - 40004.
Unfortunately, the maximum stack size is often limited; judging sandboxes in particular can set it low enough (say, 12 kB) that a 40 kB array does not fit. The operating system maps only that much stack into memory and sets STACK_POINTER to the top of it (assuming the stack grows downward).
So the net effect is that your arr pointer points beyond the allocated stack -- into unmapped memory -- and the first access triggers a segmentation fault. Normally you could fix this by raising the stack limit with ulimit -s, but you have no control over the judging platform.
You have two options:
use a heap allocation instead: int *arr = new int[10001]. This is not affected by the stack limit (see the sketch below this list). In a normal program you should take care to delete[] it, but for a short program like this it is not strictly necessary.
move the declaration of int arr[10001] to file scope. The array will then live in the BSS section, which is zero-initialized at program start, and is likewise not affected by the stack limit.
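For illustration, the first option applied to the posted program might look like this (a minimal sketch, otherwise unchanged from the question's logic):
#include <iostream>
using namespace std;

int main() {
    int *arr = new int[10001];   // lives on the free store, not the stack
    while (1) {
        int n;
        cin >> n;
        if (n == -1)
            break;
        long long int sum = 0;
        for (int i = 0; i < n; i++) {
            cin >> arr[i];
            sum += arr[i];
        }
        if ((sum % n) != 0) {
            cout << -1 << endl;
            continue;
        }
        long long int mean = sum / n;
        long long int count1 = 0;
        for (int i = 0; i < n; i++) {
            if (arr[i] > mean)
                count1 += (arr[i] - mean);
        }
        cout << count1 << endl;
    }
    delete[] arr;                // not strictly needed here, but good practice
    return 0;
}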

How to free dynamically allocated memory for an array inside a struct?

I'm trying to free the memory of the array allocated inside struct _Stack, but the program keeps crashing.
typedef struct _Stack
{
    int top;
    unsigned int capacity;
    int* arr;
} _Stack;

_Stack* createStack(int capacity)
{
    _Stack* stack = (_Stack*) malloc(sizeof(_Stack));
    stack->capacity = capacity;
    stack->top = -1;
    stack->arr = (int*) malloc(sizeof(stack->capacity * sizeof(int)));
    return stack;
}
I'm using this function to free the memory, but the program crashes here.
// I have a problem here.
void stack_free(_Stack* stack)
{
    free(stack->arr);
    free(stack);
}
Change this:
stack->arr = (int*) malloc(sizeof(stack->capacity * sizeof(int)));
to this:
stack->arr = (int*) malloc(stack->capacity * sizeof(int));
since you want the size of the array to be equal to stack->capacity * sizeof(int), and not equal to the size of that expression.
Because of the wrongly sized malloc, your program must have invoked undefined behaviour somewhere in code not shown in the question (for example by writing past the end of the undersized buffer); that's why it crashes later.
PS: Since you use C++, consider using new instead (and delete, instead of free()).
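For reference, a C++ version with new/delete might look like this (a sketch based on the struct above, not code from the question):
struct Stack
{
    int top;
    unsigned int capacity;
    int* arr;
};

Stack* createStack(int capacity)
{
    Stack* stack = new Stack;
    stack->capacity = capacity;
    stack->top = -1;
    stack->arr = new int[capacity];   // capacity ints, no sizeof needed
    return stack;
}

void stack_free(Stack* stack)
{
    delete[] stack->arr;   // delete[] matches new[]
    delete stack;          // plain delete matches plain new
}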
sizeof(stack->capacity * sizeof(int)) in your call to malloc is wrong. Instead of the size of the array, it gives the size of the number used to represent the size of the array. You probably want stack->capacity * sizeof(int).
Another possible problem is that in C you should not cast the return value of malloc, since it can hide other mistakes and cause a crash. See Do I cast the result of malloc?
In C++ you will have to do it, because of the stricter type checking in C++, but it can still hide problems.
These are the problems I see with the code you have shown. However, remember that errors in malloc and free are not necessarily caused by the actual line where they are detected. If some part of your program damages the malloc system's internal data structures, for example by a buffer overrun, the problem can manifest in a much later call to malloc or free, in a completely different part of the program.

Only able to allocate limited memory using new operator in CUDA

I wrote a CUDA kernel like this:
__global__ void mykernel(int size, int * h){
    double *x[size];
    for(int i = 0; i < size; i++){
        x[i] = new double[2];
    }
    h[0] = 20;
}

void main(){
    int size = 2.5 * 100000; // or 10,000
    int *h = new int[size];
    int *u;
    size_t sizee = size * sizeof(int);
    cudaMalloc(&u, sizee);
    mykernel<<<size, 1>>>(size, u);
    cudaMemcpy(&h, &u, sizee, cudaMemcpyDeviceToHost);
    cout << h[0];
}
I have some other code in the kernel too, but I have commented it out; that code also allocates some more memory.
Now when I run this with size = 2.5*10^5, I get h[0] to be 0.
When I run this with size = 100*100, I get h[0] to be 20.
So I am guessing that my kernels are crashing because I am running out of memory. I am using a Tesla C2075 card, which has 2 GB of RAM! I even tried shutting down the X server. What I am working on is not even 100 MB of data.
How can I allocate more memory to each block?
Now when I run this with size = 2.5*10^5, I get h[0] to be 0.
When I run this with size = 100*100, I get h[0] to be 20.
In your kernel launch, you are using this size variable also:
mykernel<<<size, 1>>>(size, u);
^^^^
On a cc2.0 device (Tesla C2075), this particular parameter in the 1D case is limited to 65535. So 2.5*10^5 exceeds 65535, but 100*100 does not. Therefore, your kernel may be running if you specify size of 100*100, but is probably not running if you specify size of 2.5*10^5.
As already suggested to you, proper CUDA error checking should point this error out to you, and in general will probably mean you need to ask far fewer questions on SO, as well as helping you post higher-quality questions. Take advantage of the CUDA runtime's ability to let you know when things have gone wrong and when you are making a mistake. Then you won't be left in a quandary, thinking you have a memory allocation problem when in fact you probably have a kernel launch configuration problem.
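For illustration, a minimal error-checking pattern around a kernel launch might look like this (a sketch; the CUDA_CHECK macro name is my own, not from the question):
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

#define CUDA_CHECK(call)                                            \
    do {                                                            \
        cudaError_t err = (call);                                   \
        if (err != cudaSuccess) {                                   \
            fprintf(stderr, "CUDA error: %s at %s:%d\n",            \
                    cudaGetErrorString(err), __FILE__, __LINE__);   \
            exit(1);                                                \
        }                                                           \
    } while (0)

// At the launch site:
// cudaGetLastError() reports launch-configuration errors (such as an
// out-of-range grid dimension); cudaDeviceSynchronize() reports errors
// that occur while the kernel itself is running.
mykernel<<<size, 1>>>(size, u);
CUDA_CHECK(cudaGetLastError());
CUDA_CHECK(cudaDeviceSynchronize());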
How can I allocate more memory to each block?
Although it is probably not your main issue (as indicated above), in-kernel new and malloc are limited by the size of the device heap. Once it has been exhausted, further calls to new or malloc will return a null pointer. If you use this null pointer anyway, your kernel code will invoke undefined behavior and will likely crash.
When using new and malloc, especially when you're having trouble, it's good practice to check for a null return value. This applies to both host (at least for malloc) and device code.
The size of the device heap is pretty small to begin with (8 MB), but it can be modified.
Referring to the documentation:
The device memory heap has a fixed size that must be specified before any program using malloc() or free() is loaded into the context. A default heap of eight megabytes is allocated if any program uses malloc() without explicitly specifying the heap size.
The following API functions get and set the heap size:
•cudaDeviceGetLimit(size_t* size, cudaLimitMallocHeapSize)
•cudaDeviceSetLimit(cudaLimitMallocHeapSize, size_t size)
The heap size granted will be at least size bytes. cuCtxGetLimit() and cudaDeviceGetLimit() return the currently requested heap size.
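Putting the documented API to use, raising the device heap limit from host code might look like this (a sketch; 128 MB is an arbitrary example value):
#include <cstdio>
#include <cuda_runtime.h>

int main() {
    // Must run before any kernel that uses in-kernel new/malloc is launched;
    // the default heap is only 8 MB.
    cudaDeviceSetLimit(cudaLimitMallocHeapSize, 128 * 1024 * 1024);

    size_t heapSize = 0;
    cudaDeviceGetLimit(&heapSize, cudaLimitMallocHeapSize);
    printf("device malloc heap size: %zu bytes\n", heapSize);
    return 0;
}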

Segmentation Fault with big input

So I wrote this program in C++ to solve COJ (Caribbean Online Judge) problem 1456: http://coj.uci.cu/24h/problem.xhtml?abb=1456. It works just fine with the sample input and with some other files I wrote to test it, but I kept getting 'Wrong Answer' as a verdict, so I decided to try a larger input file, and I got Segmentation Fault: 11. The file was 1000001 numbers long, not counting the first integer, which is the number of inputs to be tested. I know that error is related to memory, but I am really lacking more information. I hope someone can help; it is driving me nuts. I program mainly in Java, so I really have no idea how to solve this. :(
#include <stdio.h>

int main(){
    long singleton;
    long N;
    scanf("%ld", &N);
    long arr[N];
    bool sing[N];
    for(int i = 0; i < N; i++){
        scanf("%ld", &arr[i]);
    }
    for(int j = 0; j < N; j++){
        if(sing[j] == false){
            for(int i = j + 1; i < N; i++){
                if(arr[j] == arr[i]){
                    sing[j] = true;
                    sing[i] = true;
                    break;
                }
            }
        }
        if(sing[j] == false){
            singleton = arr[j];
            break;
        }
    }
    printf("%ld\n", singleton);
}
If you are writing in C, you should change the first few lines like this:
#include <stdio.h>
#include <stdlib.h>

int main(void){
    long singleton;
    long N;
    printf("enter the number of values:\n");
    scanf("%ld", &N);
    long *arr;
    arr = malloc(N * sizeof *arr);
    if(arr == NULL) {
        // malloc failed: handle error gracefully
        // and exit
    }
This will at least allocate the right amount of memory for your array.
Update: note that you can access the elements with the usual
arr[ii] = 0;
just as if you had declared the array as
long arr[N];
(which doesn't work for you).
To make it proper C++, you have to convince the standard committee to add Variable length arrays to the language.
To make it valid C, you have to include <stdbool.h>.
Probably your VLA nukes your stack, consuming a whopping 4*1000001 bytes (the bool array adds another quarter on top of that). Unless you use the proper compiler options, that is probably too much.
Anyway, you should use dynamic memory for that.
Also, using sing without initialisation is ill-advised.
BTW: The easiest C answer to your programming challenge is: read the numbers into an array (allocated with malloc), sort it (qsort works), and output the first non-duplicate (a sketch follows below).
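For illustration, that approach might look like this (a sketch in C++, with std::sort standing in for qsort):
#include <cstdio>
#include <algorithm>
#include <vector>

int main() {
    long N;
    if (scanf("%ld", &N) != 1 || N <= 0)
        return 1;
    std::vector<long> arr(N);
    for (long i = 0; i < N; i++)
        scanf("%ld", &arr[i]);
    std::sort(arr.begin(), arr.end());   // duplicates become adjacent
    for (long i = 0; i < N; i++) {
        // A singleton matches neither of its neighbours.
        bool prevSame = (i > 0 && arr[i] == arr[i - 1]);
        bool nextSame = (i + 1 < N && arr[i] == arr[i + 1]);
        if (!prevSame && !nextSame) {
            printf("%ld\n", arr[i]);
            break;
        }
    }
    return 0;
}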
When you write long arr[N]; there is no way that your program can gracefully handle the situation where there is not enough memory to store this array. At best, you might get a segfault.
However, with long *arr = malloc( N * sizeof *arr );, if there is not enough memory then you will find arr == NULL, and then your program can take some other action instead, for example exiting gracefully, or trying again with a smaller number.
Another difference between these two versions is where the memory is allocated from.
In C (and in C++) there are two memory pools where variables can be allocated: automatic memory, and the free store. In programming jargon these are sometimes called "the stack" and "the heap" respectively. long arr[N] uses the automatic area, and malloc uses the free store.
Your compiler and/or operating system combination decides how much memory is available to your program in each pool. Typically, the free store has access to a "large" amount of memory -- the maximum a process can have on your operating system. The automatic storage area, however, may be limited in size, and it has the additional drawback that if allocation fails, your process is killed or goes haywire; there is no way to recover gracefully.
Some systems use one large area and have the automatic area grow from the bottom, and free store allocations grow from the top, until they meet. On those systems you probably wouldn't run out of memory for your long arr[N], although the same drawback remains about not being able to handle when it runs out.
So you should prefer using the free store for anything that might be "large".
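As a concrete illustration of that difference, a malloc failure can be detected and handled, for example by retrying with a smaller size (a sketch; the halving policy is just an example):
#include <cstdio>
#include <cstdlib>

int main() {
    long N = 1000000000;                    // deliberately large request
    long *arr = NULL;
    while (N > 0 && (arr = (long*) malloc(N * sizeof *arr)) == NULL) {
        N /= 2;                             // allocation failed: try a smaller size
    }
    if (arr == NULL) {
        fprintf(stderr, "no memory available at all\n");
        return 1;
    }
    printf("got room for %ld longs\n", N);
    free(arr);
    return 0;
}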

Memory management in C++

I have the following program:
//simple array memory test.
#include <iostream>
using namespace std;
void someFunc(float*, int, int);
int main() {
int convert = 2;
float *arr = new float[17];
for(int i = 0; i < 17; i++) {
arr[i] = 1.0;
}
someFunc(arr, 17, convert);
for(int i = 0; i < 17; i++) {
cout << arr[i] << endl;
}
return 0;
}
void someFunc(float *arr, int num, int flag) {
if(flag) {
delete []arr;
}
}
When I put the following into gdb and set a breakpoint at float *arr ..., I step through the program and observe the following:
Printing the array arr after it has been initialized gives me 1 seventeen times.
Inside someFunc too, printing arr before the delete gives the same output as above.
Back in main, when I print arr, the first element is 0 and the remaining sixteen are still 1.
My questions:
1. Once the array has been deleted in someFunc, how am I still able to access arr without a segfault in someFunc or main?
2. The code snippet above is a test version of another piece of code that runs in a bigger program. I observe the same behaviour in both places (the first element is 0 but all the others are unchanged). If this is some unexplained memory error, how am I observing the same thing in different places?
3. Some explanations to fill the gaps in my understanding are most welcome.
A segfault occurs when you access a memory address that isn't mapped into the process. Calling delete [] releases memory back to the memory allocator, but usually not to the OS.
The contents of the memory after calling delete [] are an implementation detail that varies across compilers, libraries, OSes and especially debug-vs-release builds. Debug memory allocators, for instance, will often fill the memory with some tell-tale signature like 0xdeadbeef.
Dereferencing a pointer after it has been delete-d is undefined behavior, which means that anything can happen.
Once the array has been deleted, any access to it is undefined behavior. There's no guarantee that you'll get a segment violation; in fact, typically you won't. But there's no guarantee of what you will get either; in larger programs, modifying the contents of the array could easily result in memory corruption elsewhere.
delete returns the memory to the allocator's pool, but it does not necessarily clear the contents of that memory (doing so would be overhead for nothing), so the old values usually remain. In your case you are reading the same memory, so it prints whatever is left in it. Note that this is still undefined behaviour by the standard, even though with many memory managers it happens to "work".
Re 1: You can't. If you want to access arr later, don't delete it.
C++ doesn't check array bounds. You only get a segfault if you access memory that your process is not allowed to touch.
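For completeness, here is a sketch of how the test program could avoid the dangling pointer altogether, by letting std::vector own the memory (an addition, not code from the answers above):
#include <iostream>
#include <vector>
using namespace std;

void someFunc(vector<float> &arr, int flag) {
    if (flag) {
        arr.clear();              // destroys the elements; arr stays valid, just empty
    }
}

int main() {
    vector<float> arr(17, 1.0f);  // the vector owns its memory
    someFunc(arr, 2);
    for (float v : arr)           // prints nothing if cleared; no undefined
        cout << v << endl;        // behaviour in either case
    return 0;
}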