Segmentation fault while reading a large array from a file. C++/gcc


In the following code I'm trying to count how many rows in fileA share the same value in the second column. (Each row has two columns, and both are integers.) Sample of fileA:
1 22
8 3
9 3
I have to write the output in fileB like this:
22 1
3 2
Because element 22 appears once in the second column (and 3 appears twice).
fileA is very large (30G), and there are 41,000,000 elements in it (in other words, fileB has 41,000,000 rows). This is the code that I wrote:
void function(){
unsigned long int size = 41000000;
int* inDeg = new int[size];
for(int i=0 ; i<size; i++)
{
inDeg[i] = 0;
}
ifstream input;
input.open("/home/fileA");
ofstream output;
output.open("/home/fileB");
int a,b;
while(!input.eof())
{
input>>a>>b;
inDeg[b]++; //<------getting error here.
}
input.close();
for(int i=0 ; i<size; i++)
{
output<<i<<"\t"<<inDeg[i]<<endl;
}
output.close();
delete[] inDeg;
}
I'm getting a segmentation fault on the second line of the while loop, on the 547387th iteration. I have already assigned 600M to the stack memory based on this. I'm using gcc 4.8.2 (on Mint17 x86_64).
Solved
I analysed fileA thoroughly. As hyde mentioned, the problem wasn't with the hardware. The segfault was caused by wrong indexing (values of b beyond the end of the array). Changing the size to 61,500,000 solved my problem.

In the statement:
while(!input.eof())
{
input>>a>>b;
inDeg[b]++;
}
Is b the index of your array?
When you read in the values:
1 22
You are discarding the 1 and incrementing the value at slot 22 in your array.
You should check the range of b before incrementing the value at inDeg[b]:
while (input >> a >> b)
{
if ((b >= 0) && (b < size))
{
int c = inDeg[b];
++c;
inDeg[b] = c;
}
else
{
std::cerr << "Index out of range: " << b << "\n";
}
}

You are allocating too large an array on the heap. It's a memory issue: your heap can't provide that much space.
You should split your input and output into smaller parts. For example, write a loop that processes 100k elements at a time, deletes them, and then moves on to the next 100k.
In such cases, try exception handling. This example snippet shows how to check for exceptions when allocating arrays that are too large:
int ii;
double *ptr[5000000];
try
{
for( ii=0; ii < 5000000; ii++)
{
ptr[ii] = new double[5000000];
}
}
catch ( bad_alloc &memmoryAllocationException )
{
cout << "Error on loop number: " << ii << endl;
cout << "Memory allocation exception occurred: "
<< memmoryAllocationException.what()
<< endl;
}
catch(...)
{
cout << "Unrecognized exception" << endl;
}
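As a rough check on the sizes involved, the question's array needs 41,000,000 * sizeof(int) bytes, which is about 164 MB where int is 4 bytes (as it is with gcc on x86_64). The sketch below computes that figure and shows one way to test whether an allocation succeeds without relying on an exception. The helper names bytesForInts and tryAllocateInts are hypothetical.

```cpp
#include <cstddef>
#include <memory>
#include <new>

// Number of bytes an array of `count` ints occupies.
std::size_t bytesForInts(std::size_t count)
{
    return count * sizeof(int);
}

// Attempts the allocation with new(std::nothrow), which yields a null
// pointer on failure instead of throwing std::bad_alloc.
// The trailing () zero-initialises the elements.
std::unique_ptr<int[]> tryAllocateInts(std::size_t count)
{
    return std::unique_ptr<int[]>(new (std::nothrow) int[count]());
}
```

If tryAllocateInts returns a null pointer, the request really was too large for the process; if it succeeds, the crash has to come from somewhere else, such as an index outside the array.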

Related

Why does the code throw illegal memory access error

I want to know why the error appears as soon as the code enters the function 'sort'.
I made some check points using standard output. So I know where the error occurs.
I use repl.it to build this code
...
/*return pivot function*/
int partition(...){
...
}
void sort(vector<int> array, int left, int right){
/*********"sort start" string does not appear in console***********/
cout << "sort start";
// one element in array
if(left == right);
// two elements in array
else if( left +1 == right){
if(array.at(0) > array.at(1)){
int temp;
swap(array.at(0),array.at(1),temp);
}
}
// more than 2 elements in array
else{
int p = partition(array,left,right);
sort(array,left,p-1);
sort(array,p+1,right);
}
}
int main() {
vector<int> array;
array.push_back(1);
array.push_back(2);
array.push_back(3);
array.push_back(4);
cout << "array is ";
for(int i = 0 ; i < array.size(); i++){
cout << array.at(i) << " ";
}
cout << endl;
sort(array,0,array.size()-1);/***************sort is here*************/
cout << "sorting..." << endl;
cout << "array is ";
for(int i = 0 ; i < array.size(); i++){
cout << array.at(i) << " ";
}
return 0;
}
When I run this code console output is
array is 4 3 2 2
terminate called after throwing an instance of 'std::out_of_range'
what(): vector::_M_range_check: __n (which is 18446744073709551615) >=
this->size() (which is 4)
exited, aborted
But what I expected is
array is 4 3 2 2
sorting...
sort start
array is 2 2 3 4

You are trying to access an element at index -1, which, when converted into 64-bit unsigned value, is 18446744073709551615. Live demo: https://wandbox.org/permlink/jAhJZS3ANjkDDOUr.
There are multiple problems with your code and, first of all, you don't show us the definition of your partition and swap. Moreover, your code does not match the provided output (1 2 3 4 vs 4 3 2 2 in the first line).
Anyway, one of the problems is that you don't check for cases where left is greater than right. That can easily happen. Consider that in the very first call of sort, partition returns 0 (pivot position). Then, you call:
sort(array, left, p - 1);
which turns into
sort(array, 0, -1);
That's where negative indexes can be generated.
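A guard at the top of the function handles that case. Below is a minimal sketch, not the asker's code: partitionRange is a hypothetical Lomuto-style partition (the question does not show its own), and quickSort takes the vector by reference, since the question's sort receives a copy and so could never change the caller's array.

```cpp
#include <utility>
#include <vector>

// Hypothetical Lomuto partition: uses a[right] as the pivot and
// returns the pivot's final position.
int partitionRange(std::vector<int>& a, int left, int right)
{
    const int pivot = a[right];
    int store = left;
    for (int i = left; i < right; ++i) {
        if (a[i] < pivot) {
            std::swap(a[i], a[store]);
            ++store;
        }
    }
    std::swap(a[store], a[right]);
    return store;
}

void quickSort(std::vector<int>& a, int left, int right)
{
    // Covers single-element ranges and empty ranges such as (0, -1).
    if (left >= right) {
        return;
    }
    const int p = partitionRange(a, left, right);
    quickSort(a, left, p - 1);
    quickSort(a, p + 1, right);
}
```

Called as quickSort(v, 0, static_cast<int>(v.size()) - 1), the guard also makes an empty vector safe, because the range (0, -1) returns immediately.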

Program only works with inclusion of (side effects free) cout statements?

So I've been working on problem 15 from the Project Euler website, and my solution was working great up until I decided to remove the cout statements I was using for debugging while writing the code. My solution works by generating Pascal's Triangle in a 1D array and finding the element that corresponds to the number of paths in the NxN lattice specified by the user. Here is my program:
#include <iostream>
using namespace std;
//Returns sum of first n natural numbers
int sumOfNaturals(const int n)
{
int sum = 0;
for (int i = 0; i <= n; i++)
{
sum += i;
}
return sum;
}
void latticePascal(const int x, const int y, int &size)
{
int numRows = 0;
int sum = sumOfNaturals(x + y + 1);
numRows = x + y + 1;
//Create array of size (sum of first x + y + 1 natural numbers) to hold all elements in P's T
unsigned long long *pascalsTriangle = new unsigned long long[sum];
size = sum;
//Initialize all elements to 0
for (int i = 0; i < sum; i++)
{
pascalsTriangle[i] = 0;
}
//Initialize top of P's T to 1
pascalsTriangle[0] = 1;
cout << "row 1:\n" << "pascalsTriangle[0] = " << 1 << "\n\n"; // <--------------------------------------------------------------------------------
//Iterate once for each row of P's T that is going to be generated
for (int i = 1; i <= numRows; i++)
{
int counter = 0;
//Initialize end of current row of P's T to 1
pascalsTriangle[sumOfNaturals(i + 1) - 1] = 1;
cout << "row " << i + 1 << endl; // <--------------------------------------------------------------------------------------------------------
//Iterate once for each element of current row of P's T
for (int j = sumOfNaturals(i); j < sumOfNaturals(i + 1); j++)
{
//Current element of P's T is not one of the row's ending 1s
if (j != sumOfNaturals(i) && j != (sumOfNaturals(i + 1)) - 1)
{
pascalsTriangle[j] = pascalsTriangle[sumOfNaturals(i - 1) + counter] + pascalsTriangle[sumOfNaturals(i - 1) + counter + 1];
cout << "pascalsTriangle[" << j << "] = " << pascalsTriangle[j] << '\n'; // <--------------------------------------------------------
counter++;
}
//Current element of P's T is one of the row's ending 1s
else
{
pascalsTriangle[j] = 1;
cout << "pascalsTriangle[" << j << "] = " << pascalsTriangle[j] << '\n'; // <---------------------------------------------------------
}
}
cout << endl;
}
cout << "Number of SE paths in a " << x << "x" << y << " lattice: " << pascalsTriangle[sumOfNaturals(x + y) + (((sumOfNaturals(x + y + 1) - 1) - sumOfNaturals(x + y)) / 2)] << endl;
delete[] pascalsTriangle;
return;
}
int main()
{
int size = 0, dim1 = 0, dim2 = 0;
cout << "Enter dimension 1 for lattice grid: ";
cin >> dim1;
cout << "Enter dimension 2 for lattice grid: ";
cin >> dim2;
latticePascal(dim1, dim2, size);
return 0;
}
The cout statements that seem to be saving my program are marked with commented arrows. It seems to work as long as any of these lines are included. If all of these statements are removed, then the program will print: "Number of SE paths in a " and then hang for a couple of seconds before terminating without printing the answer. I want this program to be as clean as possible and to simply output the answer without having to print the entire contents of the triangle, so it is not working as intended in its current state.

There's a good chance that either the expression that calculates the array index or the one that calculates the array size for the allocation causes undefined behaviour, for example an out-of-bounds write.
Because the effects of undefined behaviour are not defined, the program can work as you intended or it can do something else, which could explain why it works with one compiler but not another.
You could use a vector with vector::resize() and vector::at() instead of an array with new and [] to get better information if the program aborts before writing or flushing all of its output because of an invalid memory access.
If the problem is an invalid index, vector::at() will throw an exception that you don't catch. Many debuggers stop on an uncaught exception and let you inspect the point in the program where the problem occurred, along with key facts such as the index you were trying to access and the contents of the variables.
They'll typically show you more "stack frames" than you expect, because some are internal details of how the system handles uncaught exceptions. Walk up the stack until you reach the frame from your own code, and inspect the context there.
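Reading the question's loop, its last iteration (i == numRows) writes to pascalsTriangle[sumOfNaturals(numRows + 1) - 1], which lies past the sum elements that were allocated, so that is a likely candidate for the invalid access. The sketch below shows the vector::at() approach with the loop stopping one row earlier. The names rowStart and latticePaths are hypothetical, and the result is read from column x of the last row rather than from the middle element the question uses.

```cpp
#include <cstddef>
#include <vector>

// Index of the first element of a 0-based row in the flattened triangle.
std::size_t rowStart(std::size_t row)
{
    return row * (row + 1) / 2;
}

// Number of monotonic paths through an x-by-y lattice, read from
// Pascal's triangle. Every access goes through at(), so a bad index
// throws std::out_of_range instead of corrupting memory.
unsigned long long latticePaths(std::size_t x, std::size_t y)
{
    const std::size_t numRows = x + y + 1;  // rows 0 .. x + y
    std::vector<unsigned long long> tri(rowStart(numRows), 0);
    for (std::size_t row = 0; row < numRows; ++row) {
        for (std::size_t k = 0; k <= row; ++k) {
            if (k == 0 || k == row) {
                tri.at(rowStart(row) + k) = 1;  // the row's ending 1s
            } else {
                tri.at(rowStart(row) + k) =
                    tri.at(rowStart(row - 1) + k - 1) +
                    tri.at(rowStart(row - 1) + k);
            }
        }
    }
    return tri.at(rowStart(x + y) + x);
}
```

With the original bound (row <= numRows) the first at() call of the extra row would throw, which pinpoints the faulty index immediately.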

Your program works well with g++ on Linux:
$ g++ -o main pascal.cpp
$ ./main
Enter dimension 1 for lattice grid: 3
Enter dimension 2 for lattice grid: 4
Number of SE paths in a 3x4 lattice: 35
There's got to be something else since your cout statements have no side effects.
Here's an idea on how to debug this: open two Visual Studio instances, one with the version without the cout statements and the other with the version that has them. Then step through both in the debugger to find the first difference between them. My guess is that you will realize that the cout statements have nothing to do with the error.

C++ program only lists last value entered into an array

I am trying to print to the console the values of an array that are read in at runtime. But when I run this program, all 5 values printed are the last value entered.
For example: if I give 0 1 2 3 4 as the five values for this program, then the output is shown as 4 4 4 4 4.
#include "stdafx.h"
#include<iostream>
using namespace std;
int main()
{
int arrsize = 5;
int *ptr = new int[arrsize];
*ptr = 7;
cout << *ptr << endl;
cout << "enter 5 values:";
for (int i = 0; i < arrsize; i++)
{
cin >> *ptr;
cin.get();
}
cout << "the values in the array are:\n ";
for (int i = 0; i < arrsize; i++)
{
cout << *ptr << " ";
}
delete[] ptr;
cin.get();
return 0;
}

Both of your loops:
for (int i = 0; i < arrsize; i++)
...
loop over a variable i that is never used inside the loop. You are always using *ptr, which always refers to the first element of the dynamically allocated array. You should use ptr[i] instead.
Apart from that, dynamic allocation is an advanced topic. I'd recommend sticking with simpler and more commonly used things first:
std::cout << "Enter values:";
std::vector<int> array(std::istream_iterator<int>(std::cin), {});
std::cout << "\nThe values in the array are:\n";
std::copy(begin(array), end(array), std::ostream_iterator<int>(std::cout, " "));
Live demo

I think you could tackle the following issues:
I think the first include can be omitted. Your code works without it.
You use cin.get(); I'm not sure why you need it. I think you can remove it, even the one at the very end. You could add a cout << endl for the last newline. I am using Linux.
Also, use ptr like an array with an index, ptr[i], in the loops, as mentioned in the other answer. ptr[i] is equivalent to *(ptr+i). You have to offset the pointer; otherwise you keep overwriting the same value (that is why you get that result), because ptr points to the first element of the array.
P.S.: It seems that on Windows (or other systems) you need the cin.get() to keep the console from closing. So you may want to check that. See comments below.

Crash in a C++ tutorial program (reading matrix)

I am new to using C++ and Microsoft Visual Studio, and I am trying to convert a data file (2 columns by 500 rows consisting of floats) into an array, and then output the array on screen. Whenever I try to run it, it comes up with "Unhandled exception at 0x001BC864 in file.exe: 0xC0000005: Access violation writing location 0x00320A20."
I found this video and tried to adapt that code https://www.youtube.com/watch?v=4nz6rPzVm70
Any help would be appreciated.
#include<stdio.h>
#include<string>
#include<iostream> //
#include<fstream> //
#include<array> //
#include<iomanip> //
#include<sstream>//
#include<conio.h>
using namespace std;
int rowA = 0; //
int colA = 0;
int main()
{
string lineA;
int x;
float arrayA[2][500] = { { 0 } };
ifstream myfile("C:/results.dat");
if (myfile.fail()) {
cerr << "File you are trying to access cannot be found or opened";
exit(1);
}
while (myfile.good()) {
while (getline(myfile, lineA)) {
istringstream streamA(lineA);
colA = 0;
while (streamA >> x) {
arrayA[rowA][colA] = x;
colA++; }
rowA++; }
}
cout << "# of Rows --->" << rowA << endl;
cout << "# of Columns --->" << colA << endl;
cout << " " << endl;
for (int i = 0; i < rowA; i++) {
for (int j = 0; j < colA; j++) {
cout << left << setw(6) << arrayA[i][j] << " ";
}
cout << endl;
}
return 0;
_getch();
}

Obviously your access to the array is out of bounds as you have your array indices go beyond the size of the array dimensions.
Given that this will not be the last time you run into this kind of problem, my answer will give you tips on how to detect when something like this goes wrong.
Your first tool on the rack is to add assertions to your code which make the problem evident in a debug build of your code.
#include <cassert>
// ...
while (myfile.good()) {
while (getline(myfile, lineA)) {
istringstream streamA(lineA);
colA = 0;
while (streamA >> x) {
// you would probably have noticed the error below while writing
// those assertions, as it becomes obvious that you want 500
// rows and not 2, and 2 columns, not 500...
assert( rowA < 2 ); // <<--- This will fail on the third line of the file!
assert( colA < 500 ); // <<--- This fails only if a single line holds more than 500 values.
arrayA[rowA][colA] = x;
colA++; }
rowA++; }
With only those 2 extra lines (and the include), you would have been able to see where your code messes up.
In this case, fixing your code is quite easy and I leave it to you as an exercise ;)
In order to avoid mixing up the indices for your multi-dimensional array, you could write your code in a more suggestive manner (more readable).
int main()
{
string lineA;
int x;
const int NUM_COLUMNS = 2;
const int NUM_ROWS = 500;
float arrayA[NUM_COLUMNS][NUM_ROWS] = { { 0 } };
// ...
This tiny bit of extra expressiveness increases your odds of noticing that the array access further below uses the wrong index variable for each array dimension.
Last but not least, you should add extra checks, given that your program only works correctly (after fixing it) if the input file does not violate your assumptions (2 columns, at most 500 rows). This falls under the heading of "defensive programming", i.e. your code protects itself from violations of assumptions outside its scope of control.
By the way, you repeat your error in the print loop at the bottom. There, too, you could add assertions.
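Putting those points together, one possible shape for the fixed reader is sketched below. It is only a sketch under assumptions: the function name readMatrix is hypothetical, the array is declared rows first ([NUM_ROWS][NUM_COLUMNS]), the values are read as float rather than int, and violations are reported by throwing std::runtime_error instead of asserting.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <stdexcept>
#include <string>

const std::size_t NUM_ROWS = 500;
const std::size_t NUM_COLUMNS = 2;

// Fills matrix[row][col] from whitespace-separated text, one row per line.
// Throws if the input has more rows or columns than the array can hold.
// Returns the number of rows read.
std::size_t readMatrix(std::istream& in, float (&matrix)[NUM_ROWS][NUM_COLUMNS])
{
    std::size_t row = 0;
    std::string line;
    while (std::getline(in, line)) {
        if (row >= NUM_ROWS) {
            throw std::runtime_error("too many rows in input");
        }
        std::istringstream fields(line);
        std::size_t col = 0;
        float value = 0.0f;
        while (fields >> value) {
            if (col >= NUM_COLUMNS) {
                throw std::runtime_error("too many columns in input");
            }
            matrix[row][col] = value;  // row index first, then column
            ++col;
        }
        ++row;
    }
    return row;
}
```

Because the bounds are checked before every write, a malformed file produces an error message rather than an access violation.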

C++ file writing/reading

I'm trying to create an array, write the array to a file, and then display it. It seems to be working, but I get just part of the output (the first 3 elements) or I get values beyond the bounds.
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
int arr[20];
int i;
for (i = 0; i < 5; i++)
{
cout << "Enter the value to the array: " << endl;
cin >> arr[i];
}
ofstream fl("numbers.txt");
if (!fl)
{
cout << "file could not be open for writing ! " <<endl;
}
for (i = 0; i < arr[i]; i++)
{
fl<<arr[i]<<endl;
}
fl.close();
ifstream file("numbers.txt");
if(!file)
{
cout << "Error reading from file ! " << endl;
}
while (!file.eof())
{
std::string inp;
getline(file,inp);
cout << inp << endl;
}
file.close();
return 0;
}

The terminating condition in the for loop is incorrect:
for(i=0;i<arr[i];i++)
If the user enters the following 5 ints:
1 0 4 5 6
the for loop will terminate at the second int, the 0, as 1 < 0 (which is what i<arr[i] would equate to) is false. The code has the potential to access beyond the bounds of the array, for input:
10 11 12 13 14
the for loop will iterate beyond the first 5 elements and start processing uninitialised values, because the array arr has not been initialised:
int arr[20];
which could result in out of bounds access on the array if the elements in arr happen to always be greater than i.
A simple fix:
for(i=0;i<5;i++)
Other points:
always check the result of I/O operations to ensure variables contain valid values:
if (!(cin >> arr[i]))
{
// Failed to read an int.
break;
}
the for loop must store the number of ints read into arr, so that the remainder of the code only processes values provided by the user. An alternative to a fixed-size array plus a variable that tracks the number of populated elements is a std::vector<int>, which would contain only valid ints (and can be queried for its size() or iterated using iterators).
while (!file.eof()) is not correct, as the end-of-file flag is set only after a read attempts to go beyond the end of the file. Check the result of I/O operations immediately:
while (std::getline(file, inp))
{
}

It's like hmjd said:
for(i=0;i<arr[i];i++)
looks wrong.
It should look like this:
int size;
size=sizeof(arr)/sizeof(arr[0]); // number of elements; sizeof(arr) alone is the size in bytes
for(i=0;i<size;i++)
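One thing to watch with sizeof: applied to an array it yields the size in bytes, not the number of elements, so it has to be divided by the size of one element. A small sketch of a helper that returns the element count directly (the name arrayLength is hypothetical):

```cpp
#include <cstddef>

// Deduces the element count N of a built-in array at compile time.
// It only accepts real arrays, so passing a pointer fails to compile.
template <typename T, std::size_t N>
constexpr std::size_t arrayLength(const T (&)[N])
{
    return N;
}
```

For int arr[20] this gives 20, while sizeof(arr) gives 20 * sizeof(int). Note that in the question only the first 5 elements are filled in, so a loop over all 20 would still read uninitialised values unless the array is initialised first.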

Try this:
//for(i=0;i<arr[i];i++)
for(i=0;i<5;i++)
[EDITED]
I would initialize the array with 0 like this: int arr[20] = {0}; In that case you can use, for example:
while (i < sizeof(arr)/sizeof(arr[0]) && arr[i] != 0)

i<array[i]
It is wrong because it compares i with the contents of the array; it does not check the size of the array.
