Split a text file into multiple files in C++

I'm trying to split a text file into several new files. This is what I've done so far:
long c = 0;
string s;
vector<string> v;
I need to count how many lines my txt file has (it works):
while(getline(inputFile, s, '\n')){
v.push_back(s);
c++;
}
long lineNumber = c;
long max = 100;
long nFiles;
Checking how many new files will be created:
if((lineNumber % max) ==0)
nFiles = lineNumber/max;
else
nFiles = lineNumber/max + 1;
Creating the names of the new files and writing them:
long currentLine = 0;
for(long i = 1; i <= nFiles; i++){
stringstream sstream;
string a_i;
sstream <<i;
sstream >> a_i;
string outputfiles = string("name") + "_" + a_i + ".txt"; // two string literals cannot be added directly
ofstream fout(outputfiles.c_str());
for (int j = currentLine; j<max; j++){
fout << v[j]<<endl;
}
fout.close();
currentLine = max;
}
inputFile.close();
It creates files but then suddenly stops working. Does anyone know why?

This is a prime example of a time where using a debugger could help you out.
You loop here:
for (int j = currentLine; j<max; j++){
fout << line[j]<<endl;
}
fout.close();
currentLine = max;
max = max + nMax;
max can exceed the size of line (the vector holding the lines read from the input), and accessing line[j] past the end is undefined behavior that typically shows up as a segmentation fault. The inner loop should check that j stays below line.size(). Even with that check, the program logic isn't quite right: line never grows, yet every iteration of the outer loop moves the accessed range another max indexes along. Unless the number of lines is an exact multiple of the chunk size, the last file will run past the end of line, so the loop has to stop at the end of line.
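A minimal sketch of the bounds check described above, assuming the lines are already in a std::vector<std::string>. The function name splitIntoChunks and its parameters are hypothetical and not part of the original code; it only groups the lines and leaves the file writing to the caller.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Splits `lines` into consecutive chunks of at most `maxPerFile` lines.
// Each inner vector holds the lines that would go into one output file.
std::vector<std::vector<std::string>> splitIntoChunks(
    const std::vector<std::string>& lines, std::size_t maxPerFile) {
    std::vector<std::vector<std::string>> chunks;
    if (maxPerFile == 0) return chunks;  // a zero chunk size would never advance
    for (std::size_t start = 0; start < lines.size(); start += maxPerFile) {
        // Clamp the end of the chunk to the number of lines actually read.
        std::size_t end = std::min(start + maxPerFile, lines.size());
        auto first = lines.begin() + static_cast<std::ptrdiff_t>(start);
        auto last = lines.begin() + static_cast<std::ptrdiff_t>(end);
        chunks.emplace_back(first, last);
    }
    return chunks;
}
```

Each chunk can then be written to its own ofstream (for example name_1.txt, name_2.txt, and so on). Because the end index is clamped to lines.size(), the last chunk is simply shorter instead of reading past the end.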

Related

Performance bottleneck in writing a large matrix of doubles to a file

My program opens a file which contains 100,000 numbers and parses them out into a 10,000 x 10 array correlating to 10,000 sets of 10 physical parameters. The program then iterates through each row of the array, performing overlap calculations between that row and every other row in the array.
The process is quite simple, and being new to c++, I programmed it the most straightforward way that I could think of. However, I know that I'm not doing this in the most optimal way possible, which is something that I would love to do, as the program is going to face off against my cohort's identical program, coded in Fortran, in a "race".
I have a feeling that I am going to need to implement multithreading to accomplish my goal of speeding up the program, but not only am I new to c++, I am new to multithreading, so I'm not sure how I should go about creating new threads in a beneficial way, or if it is even something that would give me that much "gain on investment" so to speak.
The program has the potential to be run on a machine with over 50 cores, but because the program is so simple, I'm not convinced that more threads is necessarily better. I think that if I implement two threads to compute the complex parameters of the two gaussians, one thread to compute the overlap between the gaussians, and one thread that is dedicated to writing to the file, I could speed up the program significantly, but I could also be wrong.
CODE:
cout << "Working...\n";
double **gaussian_array;
gaussian_array = (double **)malloc(N*sizeof(double *));
for(int i = 0; i < N; i++){
gaussian_array[i] = (double *)malloc(10*sizeof(double));
}
fstream gaussians;
gaussians.open("GaussParams", ios::in);
if (!gaussians){
cout << "File not found.";
}
else {
//generate the array of gaussians -> [10000][10]
int i = 0;
while(i < N) {
char ch;
string strNums;
string Num;
string strtab[10];
int j = 0;
getline(gaussians, strNums);
stringstream gaussian(strNums);
while(gaussian >> ch) {
if(ch != ',') {
Num += ch;
strtab[j] = Num;
}
else {
Num = "";
j += 1;
}
}
for(int c = 0; c < 10; c++) {
stringstream dbl(strtab[c]);
dbl >> gaussian_array[i][c];
}
i += 1;
}
}
gaussians.close();
//Below is the process to generate the overlap file between all gaussians:
string buffer;
ofstream overlaps;
overlaps.open("OverlapMatrix", ios::trunc);
overlaps.precision(15);
for(int i = 0; i < N; i++) {
for(int j = 0 ; j < N; j++){
double r1[6][2];
double r2[6][2];
double ol[2];
//compute complex parameters from the two gaussians
compute_params(gaussian_array[i], r1);
compute_params(gaussian_array[j], r2);
//compute overlap between the gaussians using the complex parameters
compute_overlap(r1, r2, ol);
//write to file
overlaps << ol[0] << "," << ol[1];
if(j < N - 1)
overlaps << " ";
else
overlaps << "\n";
}
}
overlaps.close();
return 0;
Any suggestions are greatly appreciated. Thanks!

C++ 'std::bad_alloc' what(): std::bad_alloc

I am trying to run the C++ code below and I get the following error. Could anyone please help me understand why this happens?
Input: input/text_4.txt 9
terminate called after throwing an instance of 'std::bad_alloc' what(): std::bad_alloc
Aborted (core dumped)
After reading a few similar threads, the suggested solution is to check dynamic memory allocation. However, my code does not have any dynamically allocated memory.
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include <sys/types.h>
#include <sys/stat.h>
using namespace std;
vector<string> arrangefile(vector<string>& scale, int width, int &number) {
int beginning = 0; int total = 0;
vector<string> result;
for(int i = 0; i < scale.size(); i++)
{
total += scale[i].size(); // add length of each word
if(total + i - beginning > width) // checking if the value has exceeded the maximum width
{
total -= scale[i].size();
string sentence= "",low="";
int last = i-1;
int space = width - total; // calculate number of spaces in each line
int check = max(last-beginning, 1);
int even = space/check;
while(even--){
low += " ";
}
int mod = space%check;
for(int j = beginning; j <= last; j++)
{
sentence += scale[j]; //find all values in a sentence
if(j < last || beginning == last)
sentence += low; // add the word low to the larger sentence
if(j - beginning < mod)
sentence += " ";
}
result.push_back(sentence); // add the sentence to the vector
number++; // counts the number of sentences
beginning = i;
total = scale[i].size();
}
}
string sentence =""; // for the last line
int last = scale.size()-1;
int check = last-beginning;
int space = width - total - check;
string low="";
while(space--){
low += " ";
}
for(int j = beginning; j <= last; j++)
{
sentence += scale[j];
if(j < last){
sentence += " ";
}
}
sentence += low;
result.push_back(sentence); // // add the sentence to the vector
number++; // counts the number of sentences
return result;
}
int main(){
string filepath, word;
int M, number=0;
cin >> filepath;
cin >> M;
ifstream fin;
fin.open(filepath.c_str());
unsigned found = filepath.find_last_of("/");
string b = filepath.substr(found+1);
int create = b.size();
string between = b.substr(0, create-4);
string final = between + "_formatted.txt";
string ending = "output/" + final;
mkdir ("output", 0777);
ofstream fout;
fout.open(ending);
for(int i = 0, count = 0; i<M; i++, count ++){
if(count == 9){
fout<<count;
count = -1;
}
else
fout<<count;
}
fout<<endl;
vector <string> first;
vector <string> second;
while(fin >> word){
first.push_back(word);
}
if(first.empty()){
cout<<"0 formatted lines written to "<< ending<<endl;
}
else{
second = arrangefile(first, M,number);
for (auto i = second.begin(); i != second.end(); ++i)
fout << *i <<endl;
cout<<number<<" formatted lines written to "<<ending<<endl;
}
fin.close();
fout.close();
return 0;
}
input file text_4.txt:
This is because not very many happy things happened
in the lives of the three Baudelaire youngsters.
Input: input/text_4.txt 8
When I run your code, on the i==16 iteration of the outer loop in arrangefile, we get width==8 and total==10, with check==1. As a result, even is initialized to -2, and so the while(even--) loop is (nearly) infinite. So it attempts to add spaces to low until it runs out of memory.
(Note that the memory used by std::string is dynamically allocated, so your code does have dynamic memory allocation. The same for std::vector.)
I haven't analyzed your algorithm closely enough to figure out the correct fix, but it's possible your loop should be while(even-- > 0) instead.
I'll second the tip in the comments to use your debugger, and I'll repost the link: "What is a debugger and how can it help me diagnose problems?" That's how I found this bug.
I ran the program under the debugger gdb. It ran for a few seconds, at which point I got suspicious because the program doesn't appear to do anything complicated enough to take that much computation time. So I interrupted the program (Ctrl-C), which let me see where it was and what it was doing. I could see that it was within the while(even--) loop. That was also suspicious because that loop should complete very fast. So I inspected the value of even (with the command p even) and saw that it was a large negative number. That could only happen if it had started as a negative number, which logically could only happen if total were greater than width. Inspecting their values, I could see that this was indeed the case.
Maybe this will be helpful as you learn more about using your debugger.
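As a hedged illustration of guarding against a negative space count, here is a small sketch. The helper name makePadding is hypothetical and not part of the original program. It only prevents the runaway loop; the case where total exceeds width (apparently a single word longer than the requested width) still needs its own handling in the algorithm.

```cpp
#include <algorithm>
#include <string>

// Returns `count` spaces, or an empty string when `count` is zero or negative.
std::string makePadding(int count) {
    // Clamp to zero so a negative count can never turn into a huge allocation.
    const int safeCount = std::max(count, 0);
    return std::string(static_cast<std::string::size_type>(safeCount), ' ');
}
```

With a helper like this, the two while loops that build low could be replaced by low = makePadding(even) and low = makePadding(space), which cannot loop indefinitely.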

Reading list of integers into an Array C++

I am working on a program in C++ that will read integers from a file, then pass them to a function that checks for a Subset Sum.
The file is formatted like so:
number of cases n
sum for case 1
list of integers separated by a space
sum for case
list of integers separated by a space
sum for case n
list of integers separated by a space
My problem now lies in how to read the list of integers into an array to be passed to my function.
This is my main so far:
fstream infile("subset.txt");
if(infile.is_open()){
int numCases, num;
infile >> numCases;
while(infile >> num){
for(int i = 0; i < numCases; i++)
{
int sum;
int set[30];
num >> sum;
for(int i = 0; i < 30; i++)
{
if(num == '\n')
{
sum[i] = -1
}
else
{
num << sum[i]
}
}
int n = sizeof(set)/sizeof(set[0]);
if(subsetSum(set, n, sum) == true)
printf("True");
else
printf("False");
}
}
}
else
printf("File did not open correctly.");
return 0;
Any help you guys can give me would be greatly appreciated.
Yes, this is for an assignment, so if you would rather just give me hints that would be appreciated as well. The assignment is for the algorithm and I have that working, I just need a hand with the I/O.
I would read the line containing the list of numbers using std::getline, then use an istringstream to parse numbers out of that string.
I'd also use a std::vector instead of an array to hold the numbers. For the actual parsing, I'd probably use a pair of std::istream_iterators, so the code would look something like this:
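int sum;
std::string line;
while (infile >> sum) {
    // Discard the rest of the sum's line so getline reads the list line (needs <limits>).
    infile.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
    std::getline(infile, line);
    std::istringstream buffer(line);
    // Parse every integer on the line into the vector.
    std::vector<int> numbers{std::istream_iterator<int>(buffer),
                             std::istream_iterator<int>()};
    std::cout << std::boolalpha << subsetSum(numbers, sum);
}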
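A fuller, self-contained sketch of the same getline plus istringstream approach, assuming the file layout described in the question (a case count, then for each case a line with the sum followed by a line of integers). The names SubsetCase and readCases are hypothetical; the call to subsetSum is left to the caller.

```cpp
#include <istream>
#include <iterator>
#include <limits>
#include <sstream>
#include <string>
#include <vector>

// One test case: the target sum and the integers listed on the following line.
struct SubsetCase {
    int sum = 0;
    std::vector<int> numbers;
};

// Reads the case count, then a sum line and a list line for each case.
// Stops early if the input ends before all cases have been read.
std::vector<SubsetCase> readCases(std::istream& in) {
    std::vector<SubsetCase> cases;
    int numCases = 0;
    if (!(in >> numCases)) return cases;
    for (int i = 0; i < numCases; ++i) {
        SubsetCase c;
        if (!(in >> c.sum)) break;
        // Skip the rest of the sum line so getline sees the list line.
        in.ignore(std::numeric_limits<std::streamsize>::max(), '\n');
        std::string line;
        if (!std::getline(in, line)) break;
        std::istringstream buffer(line);
        c.numbers.assign(std::istream_iterator<int>(buffer),
                         std::istream_iterator<int>());
        cases.push_back(c);
    }
    return cases;
}
```

Passing an std::ifstream opened on subset.txt to readCases yields one SubsetCase per case, and each numbers vector has exactly as many elements as the line contained, so no fixed-size array or sentinel value is needed.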

Data parsing from text file

I have encountered an issue parsing values from a text file. I need to add up the values of each specific event across all days and find the average. For example, (290+370+346+325+325)/5 and (5+5+5+12)/4, based on the data in the text file.
A sample is listed below.
For each line --> First event:Second event:Third event...:Total number of events:
Every new line is considered a new day.
3:290:61:148:2:5:
2:370:50:173:4:5:
5:346:87:131:4:
3:325:60:145:5:5:
3:325:60:145:5:12:13:7:
I have tried to do it myself, but I have only managed to store each column in a string array. Sample code below. I would appreciate any help, thanks!
void IDS::parseBase() {
string temp = "";
int counting = 0;
int maxEvent = 0;
int noOfLines = 0;
vector<string> baseVector;
ifstream readBaseFile("Base-Data.txt");
ifstream readBaseFileAgain("Base-Data.txt");
while (getline(readBaseFile, temp)) {
baseVector.push_back(temp);
}
readBaseFile.close();
//Find the no. of lines
noOfLines = baseVector.size();
//Find the no. of events
for (int i=0; i<baseVector.size(); i++)
{
counting = count(baseVector[i].begin(), baseVector[i].end(), ':') - 1;
if (maxEvent < counting)
{
maxEvent = counting;
}
}
//Store individual events into array
string a[maxEvent];
while (getline(readBaseFileAgain, temp)) {
stringstream streamTemp(temp);
for (int i=0; i<maxEvent; i++)
{
getline(streamTemp, temp, ':');
a[i] += temp + "\n";
}
}
}
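I suggest parsing each field directly as an integer and reading the colon into a throwaway char:
int a[maxEvent];
int i = 0; // index of the current field in this line
char c; // to hold the colon
while (i < maxEvent && streamTemp >> a[i] >> c) ++i;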
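To connect this to the averages asked for in the question, here is a rough sketch that averages each colon-separated column over the lines that actually contain it. The function name columnAverages is hypothetical, and the sketch assumes every non-empty field is a valid number (std::stod throws otherwise).

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Returns the average of each colon-separated column, counting only the
// lines (days) in which that column is present.
std::vector<double> columnAverages(const std::vector<std::string>& lines) {
    std::vector<double> sums;
    std::vector<int> counts;
    for (const std::string& line : lines) {
        std::istringstream fields(line);
        std::string field;
        std::size_t col = 0;
        while (std::getline(fields, field, ':')) {
            if (col == sums.size()) {  // first time this column appears
                sums.push_back(0.0);
                counts.push_back(0);
            }
            if (!field.empty()) {
                sums[col] += std::stod(field);
                ++counts[col];
            }
            ++col;
        }
    }
    for (std::size_t i = 0; i < sums.size(); ++i) {
        if (counts[i] > 0) sums[i] /= counts[i];
    }
    return sums;
}
```

For the sample data, the second column averages (290+370+346+325+325)/5 and the sixth averages (5+5+5+12)/4, because the third line has no sixth field and is not counted for that column.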

C++ how make a 2D array using strings and spaces instead of ints

I am making a Sudoku program and I have a test.txt file that reads
53__7____
6__195___
_98____6_
8___6___3
4__8_3__1
7___2___6
_6____28_
___419__5
____8__79
where the "_" are actually spaces. I show them as _ so you can see that there are exactly 9 characters on each line.
I was thinking that I would need something like GRID[row][column], but frankly I don't know what types my arrays should have, and I am just lost.
I simply want GRID[0][0] to return 5, while GRID[0][3] returns a ' '.
Getting the array to store both the numbers and the spaces is where I am completely lost.
What I have tried so far:
int main()
{
ifstream myfile(test.txt);
string line;
char sudoku_grid[9][9];
if (myfile.is_open())
{
while(myfile.good())
{
getline(myfile, line);
cout << sudoku_grid[line] << endl;
}
myfile.close();
}
else cout << "error";
return 0;
}
It returns the error: line 12: no match for 'operator [ ]' in 'sudoku_grid[line]'
Here is my attempt following your guidelines:
int main()
{
ifstream myfile(test.txt);
string line;
char sudoku_grid[9][9];
if (myfile.good())
{
for(int i = 0; i < 9; i++)
{
getline(myfile, line);
for(int j = 0; j < 9; j++)
{
if (line[j] == ' ')
sudoku_grid[j][i] = -1;
else sudoku_grid[j][i] = line[i];
}
cout << sudoku_grid[i] << endl;
}
myfile.close();
}
else cout << "error";
return 0;
}
The result is very awkward output: strange letters and a few numbers.
I'll just give you the algorithm/logic, not going to write the code for you. Try it and come back when stuck.
1. Initialize the output in-memory 2D array: numbers[9][9]
2. Open the file
3. Until there is no line left in the file (i is the line index):
a. Get line i
b. Until there are no more characters in the line (j is the character index):
b1. Get the character c at position j of the line
b2. If the character is not a space, then numbers[i][j]=c, else numbers[i][j]=-1
Your array can be made up of int and in b2 if a whitespace is encountered you can insert -1 to indicate the absence of a number. Of course your code manipulating numbers array needs to take that into account.
Since you need to store both characters and integer values, use char. Each of your integers lies in the range 0-9, so it can be stored as a character.
char Grid[9][9];
Now you can read each character from the string and store it in the array. This keeps the spaces intact as well as the digits. When you inspect the elements of the grid, remember that they hold character codes: for '0'-'9' the ASCII codes are 48-57, and the ASCII code for space is 32.
Hope it helps...
Edit: here is the simplest example. Place your test file in d:, or edit the path of the file in the code.
int main (void)
{
FILE *fp = fopen("d:\\test.txt","r");
char sudoku_grid[9][9], ch;
// I am assuming that file is valid and data in that is also valid
if(fp)
{
for(int i = 0; i < 9; i++)
{
for(int j = 0; j < 9; j++)
{
// to read each character
ch = fgetc(fp);
sudoku_grid[i][j] = ch;
}
// to read '\n' from the line
ch = fgetc(fp);
}
//for checking if data went correctly
for(int i = 0; i< 9;i++)
{
for(int j= 0; j<9;j++)
cout<<sudoku_grid[i][j];
cout<<endl;
}
}
return 0;
}
In the first code you get the error message because sudoku_grid can
only be indexed by numbers and not by strings.
In the second code the line
sudoku_grid[j][i] = line[i];
should probably be
sudoku_grid[j][i] = line[j];
Does this answer your question?