So I wrote this program in C++ to solve COJ (Caribbean Online Judge) problem 1456, http://coj.uci.cu/24h/problem.xhtml?abb=1456. It works just fine with the sample input and with some other files I wrote to test it, but I kept getting 'Wrong Answer' as a verdict, so I decided to try a larger input file and got Segmentation Fault: 11. The file was 1000001 numbers long, not counting the first integer, which is the number of inputs to be tested. I know that error is caused by something related to memory, but I am really lacking more information. Hope someone can help; it is driving me nuts. I program mainly in Java, so I really have no idea how to solve this. :(
#include <stdio.h>

int main() {
    long singleton;
    long N;
    scanf("%ld", &N);
    long arr[N];
    bool sing[N];
    for (int i = 0; i < N; i++) {
        scanf("%ld", &arr[i]);
    }
    for (int j = 0; j < N; j++) {
        if (sing[j] == false) {
            for (int i = j + 1; i < N; i++) {
                if (arr[j] == arr[i]) {
                    sing[j] = true;
                    sing[i] = true;
                    break;
                }
            }
        }
        if (sing[j] == false) {
            singleton = arr[j];
            break;
        }
    }
    printf("%ld\n", singleton);
}
If you are writing in C, you should change the first few lines like this:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    long singleton;
    long N;
    printf("enter the number of values:\n");
    scanf("%ld", &N);
    long *arr;
    arr = malloc(N * sizeof *arr);
    if (arr == NULL) {
        // malloc failed: handle error gracefully
        // and exit
    }
This will at least allocate the right amount of memory for your array.
Update: note that you can access these elements with the usual
arr[ii] = 0;
Just as if you had declared the array as
long arr[N];
(which doesn't work for you).
To make it proper C++, you have to convince the standard committee to add Variable length arrays to the language.
To make it valid C, you have to include <stdbool.h>.
Probably your VLA nukes your stack, consuming a whopping 4*1000001 bytes. (The bool array only adds a quarter to that.) Unless you use the proper compiler options, that is probably too much.
Anyway, you should use dynamic memory for that.
Also, using sing without initialisation is ill-advised.
BTW: the easiest C answer to your programming challenge is: read the numbers into an array (allocated with malloc), sort it (qsort works), and output the first non-duplicate.
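As a rough illustration, here is a minimal sketch of that sort-based idea, written in C++ with std::sort standing in for qsort, and assuming (as the problem implies) that every value except one appears exactly twice:

#include <algorithm>
#include <cstdio>
#include <vector>

int main() {
    long n;
    if (std::scanf("%ld", &n) != 1) return 1;
    std::vector<long> arr(n);            // heap storage, no VLA needed
    for (long i = 0; i < n; i++) std::scanf("%ld", &arr[i]);
    std::sort(arr.begin(), arr.end());   // duplicates become adjacent
    for (long i = 0; i < n; i += 2) {    // walk the sorted values in pairs
        if (i + 1 == n || arr[i] != arr[i + 1]) {
            std::printf("%ld\n", arr[i]); // the pairing breaks at the singleton
            break;
        }
    }
}

This runs in O(N log N) rather than the quadratic pairwise scan above, which also matters for a million-element input.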
When you write long arr[N]; there is no way that your program can gracefully handle the situation where there is not enough memory to store this array. At best, you might get a segfault.
However, with long *arr = malloc( N * sizeof *arr );, if there is not enough memory then you will find arr == NULL, and then your program can take some other action instead, for example exiting gracefully, or trying again with a smaller number.
Another difference between these two versions is where the memory is allocated from.
In C (and in C++) there are two memory pools where variables can be allocated: automatic memory, and the free store. In programming jargon these are sometimes called "the stack" and "the heap" respectively. long arr[N] uses the automatic area, and malloc uses the free store.
Your compiler and/or operating system combination decide how much memory is available to your program in each pool. Typically, the free store will have access to a "large" amount of memory, the maximum a process can have on your operating system. The automatic storage area, however, may be limited in size, and has the additional drawback that if allocation fails, your process is killed or simply goes haywire.
Some systems use one large area and have the automatic area grow from the bottom, and free store allocations grow from the top, until they meet. On those systems you probably wouldn't run out of memory for your long arr[N], although the same drawback remains about not being able to handle when it runs out.
So you should prefer using the free store for anything that might be "large".
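In C++ the most convenient route to the free store is std::vector, which reports an allocation failure by throwing std::bad_alloc rather than by crashing. A minimal sketch of reading N values that way:

#include <iostream>
#include <new>     // std::bad_alloc
#include <vector>

int main() {
    long n;
    std::cin >> n;
    try {
        std::vector<long> arr(n);  // elements live on the free store
        for (long i = 0; i < n; i++) std::cin >> arr[i];
        // ... work with arr ...
    } catch (const std::bad_alloc&) {
        std::cerr << "not enough memory for " << n << " elements\n";
        return 1;
    }
}   // the vector frees its memory automatically here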
It's more of a conversation topic than a question. Look at the following code for calculating the nth Fibonacci number and printing all of them up to the nth (namespace std presumed, and the following is in main):
int n = 20;
int *a = new int; // notice NO size declaration
a[0] = 1;
a[1] = 1;
for (int i = 2; i < n; i++) {
    a[i] = a[i-1] + a[i-2];
}
for (int i = 0; i < n; i++) {
    cout << a[i] << endl;
}
So should it work? Does it work for you? Any comments as to whether it wouldn't work for someone?
Thank you in advance. That's my personal method for allocating memory dynamically in 1D, but I can't find any documentation for this method, and I've been using it forever.
Of course, I don't do the same in 2D.
Thank you for reading.
So should it work?
No. Accessing any index other than a[0] produces undefined behavior, since it's outside the allocated memory.
Does it work for you?
It might appear to work in some cases. When you have undefined behavior, anything can happen; there's no requirement that the error be detected (see Why don't I get a segmentation fault when I write beyond the end of an array?). It's possible that the memory that it uses isn't in use for anything else, so it doesn't cause an obvious failure. But it could also cause corruption of data that's used by some unrelated part of the application or library; you might not notice the problem immediately.
A common symptom of undefined behavior is that seemingly innocuous changes cause unexpected changes in behavior. For instance, the program might work with a print statement in it, but fail when you remove that statement.
The way you are doing this is not "memory safe", because you are only allocating memory for a single integer and then accessing subsequent addresses that have never been allocated. This is equivalent to declaring a fixed-size array and accessing indexes beyond its size. There are a couple of options to do what you want:
Allocating memory manually
Use malloc() to allocate the number of bytes you need, and call free() once you are done with the memory; the compiler does not do that for you.
#include <cstdlib> // for malloc() and free(); note the cast required in C++

int n = 20;
int *a = (int *)malloc(sizeof(int) * n);
a[0] = 1;
a[1] = 1;
for (int i = 2; i < n; i++) {
    a[i] = a[i-1] + a[i-2];
}
free(a);
Using a C++ vector Container
Vectors are dynamically-sized arrays. They can change size automatically and are very useful. Read more here.
#include <vector>

int n = 20;
std::vector<int> a;
a.push_back(1);
a.push_back(1);
for (int i = 2; i < n; i++) {
    a.push_back(a[i-1] + a[i-2]);
}
I was trying to solve the very basic problem SPOJ CANDY.
I am getting a segmentation fault when submitting the solution below, but in Visual Studio it works fine. I also declared the variables with their size in mind (sum as long long int), because it can get large.
1) Is it due to the fact that I am declaring the array inside the while loop? Should I declare the array outside the while loop, so that every test case uses the same array?
2) Since a new array is created every time the loop runs (for every test case), will that lead to garbage, or will the compiler automatically free the memory after every test case? (I know that with dynamic memory allocation we have to free the memory explicitly.) Can you tell me in which scope I should declare the variables?
I have these doubts because segmentation faults are about memory access.
#include <iostream>
using namespace std;

int main() {
    while (1) {
        int n;
        int arr[10001];
        cin >> n;
        if (n == -1)
            break;
        long long int sum = 0;
        for (int i = 0; i < n; i++) {
            int temp;
            cin >> temp;
            sum += temp;
            arr[i] = temp;
        }
        int mean = sum / n;
        if ((sum % n) != 0) {
            cout << -1 << endl;
            continue;
        }
        int count1 = 0;
        for (int i = 0; i < n; i++) {
            if (arr[i] > mean) {
                count1 += (arr[i] - mean);
            }
        }
        cout << count1 << endl;
    }
}
Your problem is probably due to the stack allocation of int arr[10001], which is a roughly 40 kB allocation. Now, "allocation" is the wrong word: it essentially just computes the address of arr by doing something like int *arr = STACK_POINTER - 40004.
Unfortunately, the maximum stack size can be quite small by default, sometimes well under what this array needs. This means that the operating system maps only that much memory for the stack and sets STACK_POINTER to the top of it (assuming the stack grows downward).
So the net effect is that your arr pointer now points beyond the allocated stack -- into unallocated memory -- and the first access throws a segmentation fault. Normally you could fix this by upping the stack size with ulimit -s, but you do not have control over the judging platform used.
You have two options:
use a heap allocation instead: int *arr = new int[10001] (sketched below). This is not affected by the initial stack size. In a normal program you should take care to clean this up, but for a short program like this it is not necessary.
move the declaration of int arr[10001] to the top level (file scope). arr will then live in a region known as the BSS section, which is initially zeroed. This is also not affected by the initial stack size.
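As a sketch, here is how the first option might look applied to the program above; the logic is unchanged, only the array has moved to the heap:

#include <iostream>
using namespace std;

int main() {
    int *arr = new int[10001];  // heap allocation: unaffected by stack limits
    int n;
    while (cin >> n && n != -1) {
        long long sum = 0;
        for (int i = 0; i < n; i++) {
            cin >> arr[i];
            sum += arr[i];
        }
        if (sum % n != 0) {
            cout << -1 << endl;
            continue;
        }
        long long mean = sum / n;
        long long moves = 0;
        for (int i = 0; i < n; i++)
            if (arr[i] > mean) moves += arr[i] - mean;
        cout << moves << endl;
    }
    delete[] arr;  // cleanup; optional for a program this short
    return 0;
}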
In C++ one can declare an array as
typename array[size];
or
typename *array = new typename[size];
where array has length size and its elements are indexed from 0 to size - 1.
My question is: am I allowed to access elements at index >= size?
So I wrote this little code to check:
#include <iostream>
using namespace std;

int main()
{
    //int *c;          // for dynamic allocation
    int n;             // length of the array c
    cin >> n;          // getting the length
    //c = new int[n];  // for dynamic allocation
    int c[n];          // for stack allocation (a VLA, non-standard C++)
    for (int i = 0; i < n; i++)  // getting the elements
        cin >> c[i];
    for (int i = 0; i < n + 10; i++)  // printing the elements; I added 10 to the
        cout << c[i] << " ";          // size, to access memory I haven't allocated
    return 0;
}
And the result is like this
2
1 2
1 2 2686612 1970422009 7081064 4199040 2686592 0 1 1970387429 1971087432 2686700
Shouldn't the program have crashed, instead of printing garbage values? And both allocation methods give the same result. This creates bugs that are hard to detect. Is it related to the environment, to the compiler I am using, or to something else?
I was using the Code::Blocks IDE with the TDM-GCC 4.8.1 compiler on Windows 8.1.
Thanks in advance.
This is called "undefined behavior" in the C++ standard.
Undefined behavior can mean any one of the following:
The program crashes
The program continues to run, but produces meaningless, garbage results
The program continues to run, and automatically copies the entire contents of your hard drive, and posts it on Facebook
The program continues to run, and automatically subscribes you to Publishers Clearinghouse Sweepstakes
The program continues to run, but your computer catches fire and explodes
The program continues to run, and makes your computer self-aware, which automatically links and networks with other self-aware networks, forming Skynet, and destroying the human race
Conclusion: do not run and access elements past the end of your arrays.
C++ compilers don't enforce bounds checking, as there is no specification requiring them to.
When you access an element of an array, no boundary check is done: c[i] just gets translated to *(c + i), i.e. an access at byte offset i * sizeof(int) from c, and that's it. If that area of memory is not initialized you'll get garbage, but you could also get other, meaningful data; it all depends on what happens to be there.
Note that depending on the OS and the C++ runtime you can get different results; on a Linux box, for instance, you'll probably get a segmentation fault and the program will crash.
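If you actually want a bounds check, std::vector offers one through its at() member, which throws std::out_of_range instead of silently reading past the end. A small sketch:

#include <iostream>
#include <stdexcept>
#include <vector>

int main() {
    std::vector<int> c = {1, 2};
    // c[5] would compile, but operator[] is unchecked: undefined behavior.
    try {
        std::cout << c.at(5) << "\n";  // at() is bounds-checked
    } catch (const std::out_of_range& e) {
        std::cout << "caught: " << e.what() << "\n";
    }
    return 0;
}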
Good night
I'm learning C++ for scientific code development. Unlike the learning strategy I used for Python and MATLAB, I'm trying to learn C++ the old, deep way (using a book, ref [1]). I'm struggling to understand pointers and dynamic memory allocation.
I came across this exercise and some doubts appeared. The exercise is a simple 2x2 matrix multiplication. The code:
#include <iostream>
// Perform C[2][2] = A[2][2]*B[2][2] using dynamic memory allocation

int main() {
    // declaring the dimensions of the matrices
    int row, col;
    row = 2;
    col = 2;
    // allocating memory for the matrices [heap?]
    double **A, **B, **C;
    A = new double*[row];
    B = new double*[row];
    C = new double*[row];
    for (int i = 0; i < row; ++i) {
        A[i] = new double[col];
        B[i] = new double[col];
        C[i] = new double[col];
    }
    // performing the calculation (assigning rather than using +=,
    // since new double[col] does not zero-initialize the elements)
    for (int i = 0; i < 2; ++i) {
        for (int j = 0; j < 2; ++j) {
            A[i][j] = 10;
            B[i][j] = 20;
            C[i][j] = A[i][j] * B[i][j];
            std::cout << C[i][j] << " ";
        }
        std::cout << std::endl;
    }
    // deleting the allocated memory [is there a memory leak? (no *=NULL)]
    for (int i = 0; i < 2; ++i) {
        delete[] A[i];
        delete[] B[i];
        delete[] C[i];
    }
    delete[] A;
    delete[] B;
    delete[] C;
    return 0;
}
My doubts are:
1 - I read about the memory division (global/stack/heap): where are these located in the hardware? HD/RAM/CPU cache?
2 - when I allocate an array:
int arrayS[2];       // is this on the stack?
int *arrayH;         // is this on the heap?
arrayH = new int[2];
3 - In my resolution of the matrix problem, is there a memory leak or garbage creation? (Note that I didn't point the arrays to NULL to get rid of the addresses.)
4 - Do you suggest any way to improve my code performance and efficiency ?
Thank you !
1) The global area, stack and heap are all located within your application's CPU address space. Which will mean they're in RAM, subject to virtual memory paging (in which case they may go and live on the HD for a while) and CPU caching (in which case they may go and live on the CPU for a while). But both of those things are transparent as far as the generated code is concerned.
2) yes, arrayS[2] will be on the stack (unless it's global, of course), and anything returned by new (or malloc, should you include some C code) is on the heap.
3) I can't see any leaks, but if you're going to use row and col rather than repeating the magic constant 2 all over the place, then do so uniformly and mark them const so that you can't accidentally modify them.
4) in terms of cache efficiency, it may be better to do one allocation, as new double[col * row], and then either spread your pointers out throughout that block or index as [i*col + j], assuming i indexes rows and all rows are col items long. new is otherwise permitted to spread your rows out wherever it likes across memory, which is more likely to lead to cache misses.
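A minimal sketch of that single-allocation layout, with the [i*col + j] indexing done by hand:

#include <iostream>

int main() {
    const int row = 2, col = 2;
    // One contiguous block instead of `row` separate allocations:
    // better cache locality, and a single new[]/delete[] pair.
    double *m = new double[row * col]();  // the () value-initializes to 0.0
    for (int i = 0; i < row; ++i)
        for (int j = 0; j < col; ++j)
            m[i * col + j] = 10;          // element (i, j)
    std::cout << m[1 * col + 1] << std::endl;  // prints 10
    delete[] m;
    return 0;
}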
As to style? Well, you didn't ask, but I think cdhowie's comment is valid. Also Deduplicator's point should be read more deeply: the STL has a bunch of pieces to help make sure you don't leak memory — read up on unique_ptr and shared_ptr.
I read about the memory division (global/stack/heap), where are they located in the hardware ? HD/RAM/CPU cache ?
They can all be stored wherever your particular C++ program decides that they should be. Some local variables are likely to exist only in registers, if they have a short life span and never have a pointer taken to them.
Odds are that everything else will wind up somewhere in system RAM (which can be paged out to disk, if your process is unlucky). The difference is how they are used by the program, not where they happen to be. Where they are is an implementation detail that you don't really need to worry about.
when I allocate an array
Yes, your analysis of stack vs. heap there is correct. Your first one (int arrayS[2];) has the caveat that if you declare this variable as part of a class or struct, it could exist either place depending on how the class/struct gets created. (If the class/struct gets created on the heap then the array would be on the heap; if it gets created on the stack then on the stack.)
And of course if it's a global then it has global storage.
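A tiny sketch of that caveat (Holder is a made-up name for illustration):

#include <iostream>

struct Holder {
    int arrayS[2];  // lives wherever the enclosing Holder lives
};

int main() {
    Holder onStack;                 // onStack.arrayS is on the stack
    Holder *onHeap = new Holder();  // onHeap->arrayS is on the heap
    onStack.arrayS[0] = 1;
    onHeap->arrayS[0] = 2;
    std::cout << onStack.arrayS[0] << " " << onHeap->arrayS[0] << std::endl;
    delete onHeap;
    return 0;
}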
Do you suggest any way to improve my code performance and efficiency ?
Encapsulate the concept of a matrix inside of a class, and use std::vector instead of allocating your own arrays. The amount of indirection will be the same after compiler optimization, but you no longer have to concern yourself with managing the memory used by the arrays.
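A minimal sketch of what that might look like; the class name and interface here are just one possible design, backed by a single flat std::vector:

#include <iostream>
#include <vector>

// Sketch of a matrix class backed by one flat std::vector.
// No manual new/delete: the vector releases its memory automatically.
class Matrix {
public:
    Matrix(int rows, int cols)
        : rows_(rows), cols_(cols), data_(rows * cols, 0.0) {}
    double &operator()(int i, int j)       { return data_[i * cols_ + j]; }
    double  operator()(int i, int j) const { return data_[i * cols_ + j]; }
    int rows() const { return rows_; }
    int cols() const { return cols_; }
private:
    int rows_, cols_;
    std::vector<double> data_;
};

int main() {
    Matrix a(2, 2);
    a(0, 0) = 10;
    std::cout << a(0, 0) << std::endl;  // prints 10
    return 0;
}

The a(i, j) accessors keep the storage contiguous, which also addresses the cache point from the previous answer.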
In C++ an object can live on the stack or on the heap. In your example, int arrayS[2] is located on the stack of the function in which it was declared. If you want to create an object on the heap (memory that is available to all the functions in the program), you must use the new operator. The pointer that holds the address of the allocated object (in your case arrayH) is itself pushed onto the stack of the function in which it is declared.
Actually, the memory of a process (a program being executed) is divided into three parts:
The code area (where the program's code is located);
The stack area (which starts at the location the stack pointer points to in memory; the stack pointer is a register that always holds the address of the current stack frame, and a program can have more than one frame at a time, depending on its depth of recursion);
The data area (the global memory accessible from the entire program; in C++ it is allocated with the new operator or as global variables).
There is also shared memory, which lets the programmer allocate memory that is available to more than one process.
There is no memory leak in your code. But C++ provides smart pointers, which play a role similar to the garbage collector in C# and Java.
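A small sketch of that idea with std::unique_ptr (C++11), applied to a heap array like the ones in the question:

#include <iostream>
#include <memory>

int main() {
    // unique_ptr frees the array automatically when it goes out of
    // scope: no delete[] to forget, even on early returns or exceptions.
    std::unique_ptr<double[]> a(new double[4]());  // () zero-initializes
    a[0] = 1.5;
    std::cout << a[0] << std::endl;
    return 0;
}   // memory released here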
Also, C++ is an OOP language, so you should definitely write your code using classes, inheritance, polymorphism, etc.
I have a simple C++ program using the multiprecision library MPFR, written to try to understand a memory problem in a bigger program:
#include <iostream>
#include <mpfr.h>
using namespace std;

int main() {
    int prec = 65536, size = 1, newsize = 1;
    mpfr_t **mf;
    while (true) {
        size = newsize;
        mf = new mpfr_t*[size];
        for (int i = 0; i < size; i++) {
            mf[i] = new mpfr_t[size];
            for (int j = 0; j < size; j++) mpfr_init2(mf[i][j], prec);
        }
        cout << "Size of array: ";
        cin >> newsize;
        for (int i = 0; i < size; i++) {
            for (int j = 0; j < size; j++) mpfr_clear(mf[i][j]);
            delete[] mf[i];
        }
        delete[] mf;
    }
}
The point here is to declare arrays of different sizes and monitor the memory usage with Task Manager (I'm using Windows). This works fine for sizes below roughly 200, but if I declare something larger, the memory doesn't seem to be freed when I decrease the size again.
Here's an example run:
I start the program and choose size 50. Then I change sizes between 50, 100, 150 and 200 and see the memory usage go up and down as expected. I then choose size 250 and the memory usage goes up as expected but when I go back to 200 it doesn't decrease but increases to something like the sum of the memory values needed for size 200 and 250 respectively. A similar behaviour is seen with bigger sizes.
Any idea what's going on?
Process Explorer will give you a more realistic view of your process's memory usage (Virtual Size) than Task Manager will. A memory leak is when a program doesn't free memory it should; if that happens all the time, its memory usage will never stop increasing.
Windows won't necessarily return your program's freed memory to the system right away, so Task Manager and the like won't tell you the whole truth.
To detect memory leaks in Visual Studio you can enable the _CRTDBG_MAP_ALLOC macro, as described on this MSDN page.
Also, this question talks a bit about making it work with the C++ new keyword.
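For reference, a minimal sketch of the usual setup from that documentation (MSVC debug builds on Windows only); the deliberate leak is just there to trigger a report:

// _CRTDBG_MAP_ALLOC must be defined before the CRT headers so that
// the leak report includes file and line information.
#define _CRTDBG_MAP_ALLOC
#include <stdlib.h>
#include <crtdbg.h>

int main() {
    int *leaked = (int *)malloc(10 * sizeof(int));  // never freed, on purpose
    (void)leaked;
    _CrtDumpMemoryLeaks();  // prints the leak report to the debug output window
    return 0;
}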