I have two text files, each with an unknown number of integers sorted from lowest to highest... for example:
input file 1: 1 3 5 7 9 11...
input file 2: 2 4 6 8 10 ....
I want to take these numbers from both files, sort from low to high, and then output the full list of sorted numbers from both input files to a single output file. What I have so far...
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
#include "iosort.h"
int main()
{
const char* filename1 = "numberlist1.txt";
const char* filename2 = "numberlist2.txt";
std::ofstream ofs("output.txt");
std::ifstream ifs1, ifs2;
std::string input1, input2;
ifs1.open(filename1);
std::getline(ifs1, input1);
std::cout << "Contents of file 1: " << input1 << std::endl;
ifs2.open(filename2);
std::getline(ifs2, input2);
std::cout << "Contents of file 2: " << input2 << std::endl;
ioSort(ifs1, ifs2, ofs);
return 0;
}
and my function...
#include <fstream>
#include <sstream>
#include <vector>
#include "iosort.h"
void ioSort(std::ifstream& in1, std::ifstream& in2, std::ofstream& out)
{
int a, b;
std::vector<int> f1, f2, f3; //create one vector for each input stream
while (in1 >> a)
{
f1.push_back(a);
}
while (in2 >> b)
{
f2.push_back(b);
}
//now f1 and f2 are vectors that have the numbers from the input files
//we know that in these input files numbers are sorted from low to high
if (f1.size() > f2.size()) //input stream 1 was larger
{
for (int i = 0; i < f2.size(); i++)
{
if (f1[i] > f2[i]) //number at input vector 2 less that respective pos
{ //in input vector 1
f3.push_back(f2[i]);
}
else if(f1[i] == f2[i]) //numbers are equal
{
f3.push_back(f1[i]);
f3.push_back(f2[i]);
}
else //number in 1 is less than that in vector 2
{
f3.push_back(f1[i]);
}
}
for (int i = f2.size(); i < f1.size(); i++)
{
f3.push_back(f1[i]); //push remaining numbers from stream 1 into vector
}
}
else //input stream 2 was larger
{
for (int i = 0; i < f1.size(); i++)
{
if (f1[i] > f2[i]) //number at input vector 2 less that respective pos
{ //in input vector 1
f3.push_back(f2[i]);
}
else if(f1[i] == f2[i]) //numbers are equal
{
f3.push_back(f1[i]);
f3.push_back(f2[i]);
}
else //number in 1 is less than that in vector 2
{
f3.push_back(f1[i]);
}
}
for (int i = f1.size(); i < f2.size(); i++)
{
f3.push_back(f1[i]); //push remaining numbers from stream 2 into vector
}
}
//send vector contents to output file
for (int i = 0; i < f3.size(); i++)
{
out << f3[i] << " ";
}
}
Every time I compile and run, the file output.txt is created, but it is empty. Can anybody point out what I am doing wrong? If, in main, I do something like:
out << 8 << " " << 9 << std::endl;
then it will show up in the output file.
AHA! Found your error. You're opening the file, then reading it directly to stdout (where you list the contents of your file), and then passing the same stream into your function. You cannot do this. Whenever you read from a file, the stream moves further through the file. By the time you're in your sorting function, you're at the end of the file, and so no numbers are read!
You need to remove the lines
std::getline(ifs1, input1);
std::cout << "Contents of file 1: " << input1 << std::endl;
and
std::getline(ifs2, input2);
std::cout << "Contents of file 2: " << input2 << std::endl;
Instead, print them out after you've stored them in the vector.
I'll leave the rest of my reply down below, since you, or posterity, might need it.
I'm not sure what's going on with your output file problem. Go through the whole chain and see where it's failing:
After you've read your file in, print out f1 and f2 with cout. Are they there and what you expect? If they are, we can move on.
After your algorithm has run, is your f3 there, and what you expect? If so, keep going!
This lets you diagnose the exact line where your code is failing (i.e. not doing what you expect it to do), and you can rule out everything you've already checked.
Of course, instead of using cout you can launch this under a debugging environment and see what happens step by step, but if you don't know how, it'll take longer to do that to diagnose your problem the first time.
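For the two checks above, a small helper along these lines can dump a vector to any stream. This is only a sketch: the name dumpVector is made up here and is not part of the original code.

```cpp
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical debugging helper: writes "label: v0 v1 v2 ..." to the given stream.
void dumpVector(std::ostream& os, const std::string& label, const std::vector<int>& v)
{
    os << label << ":";
    for (int x : v)
        os << ' ' << x;
    os << '\n';
}
```

Calling dumpVector(std::cout, "f1", f1); right after the two read loops in ioSort would show whether anything was read at all.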
You do have other problems though: your merge function has errors. You end up skipping certain elements because you're only using one index for both arrays. Think about it: you only push one number into your output array in the f1[i] > f2[i] and f1[i] < f2[i] cases, but you discard both by incrementing i.
You can take your merge loop and simplify it by a lot, while also fixing your mistake :).
auto it = f1.cbegin();
auto jt = f2.cbegin();
while (it != f1.cend() && jt != f2.cend()) {
if (*it < *jt) f3.push_back(*it++); //f1 is smaller, push it to f3 and advance the f1 iterator
else if (*it > *jt) f3.push_back(*jt++); //f2 is smaller, push it to f3 and advance the f2 iterator
else { //implicit equals, only option left
f3.push_back(*jt++);
f3.push_back(*it++);
}
}
while (it != f1.cend()) f3.push_back(*it++);
while (jt != f2.cend()) f3.push_back(*jt++);
So now f3 contains your sorted array, sorted in O(m+n) time. If you're doing this for the sake of learning, I'd try to remedy your error using your way first before switching over to this.
If you want to write less code and speed isn't a problem, you can use <algorithm> to do this, too, but it's a terrible O((n+m)lg(n+m)).
auto it = f1.cbegin();
auto jt = f2.cbegin();
while (it != f1.cend()) f3.push_back(*it++);
while (jt != f2.cend()) f3.push_back(*jt++);
std::sort(f3.begin(), f3.end());
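If the goal is less code without giving up linear time, std::merge from <algorithm> may be a better fit than std::sort, since it relies on both inputs already being sorted. A minimal sketch, where mergeSorted is an illustrative name rather than anything from the original code:

```cpp
#include <algorithm>
#include <iterator>
#include <vector>

// Illustrative helper (not from the original post): merges two ascending
// vectors into a new ascending vector in O(n + m).
std::vector<int> mergeSorted(const std::vector<int>& f1, const std::vector<int>& f2)
{
    std::vector<int> f3;
    f3.reserve(f1.size() + f2.size());
    std::merge(f1.begin(), f1.end(), f2.begin(), f2.end(), std::back_inserter(f3));
    return f3;
}
```

Equal values from both inputs are kept, so duplicates appear twice in the result, just as in the hand-written loop.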
Since you're reading the file with std::getline() before calling ioSort(), there's nothing for the sorting function to read.
You can rewind back to the beginning of the file with seekg().
ifs1.clear();
ifs1.seekg(0, ifs1.beg);
ifs2.clear();
ifs2.seekg(0, ifs2.beg);
ioSort(ifs1, ifs2, ofs);
See How to read same file twice in a row
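The clear-then-seek pair can be wrapped in a small function so the order is not forgotten. This is a sketch only; rewindStream is a made-up name, and it takes a std::istream so it can be tried on a string stream as well as on a file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: resets the error flags and moves the read position
// back to the start of the stream.
void rewindStream(std::istream& in)
{
    in.clear();                  // clear() first: seekg does nothing if the stream is in a failed state
    in.seekg(0, std::ios::beg);  // then jump back to the beginning
}
```

With this, rewindStream(ifs1); rewindStream(ifs2); before the call to ioSort would have the same effect as the four lines above.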
In order to be short:
#include <fstream>
#include <algorithm>
#include <iterator>
int main()
{
std::ifstream infile1("infile1.txt");
std::ifstream infile2("infile2.txt");
std::ofstream outfile("outfile.txt");
std::merge(
std::istream_iterator<int>{infile1}, std::istream_iterator<int>{},
std::istream_iterator<int>{infile2}, std::istream_iterator<int>{},
std::ostream_iterator<int>{outfile, " "}
);
}
std::merge is an STL algorithm that merges two sorted ranges into one sorted range. In this case the ranges are the files. The input files are viewed as ranges using std::istream_iterator<int>, and the output file is written as a range using std::ostream_iterator<int>.
Would anyone be able to help me? I am trying to write a C++ program that reads values from a CSV file and prints them (there are going to be three 4 rows, 3 columns). I want to know how I can add up the rows (the sum of the first row = ?, the sum of the second row = ?, ...).
The matrix looks like this:
And my program looks like:
#include <iostream>
#include <fstream>
using namespace std;
int main() {
ifstream sumCol;
sumCol.open("Col.csv");
string line;
int sum;
cout << "The entered matrix: " << endl;
while(sumCol.good()){
string line;
getline(sumCol, line, ',');
cout << line << " ";
}
while(sumCol.good()){
getline(sumCol, line, ',');
int i = stoi(line);
cout << endl;
cout << i;
}
while(getline(sumCol, line, ',')){
int i = stoi(line);
sum = sum + i;
getline(sumCol, line, ',');
i = stoi(line);
sum = sum + i;
getline(sumCol, line);
i = stoi(line);
sum = sum + i;
}
cout << sum << endl;
return 0;
}
Next try. Very simple answer with just basic constructs.
12 statements. But 2 for loops, nested and hardcoded magic numbers for rows and columns.
Please see the below well commented source code:
#include <iostream>
#include <fstream>
int main() {
// Open file
std::ifstream csv("col.csv");
// and check, if it could be opened
if (csv) {
// Now the file is open. We have 3 rows and 4 columns
// We will use 2 nested for loops, one for the rows and one for the columns
// So, first the rows
for (int row = 0; row < 3; ++row) {
// Every row has a sum
int sumForOneRow = 0;
// And every row has 4 columns. Go through all columns of the current row.
for (int col = 0; col < 4; ++col) {
// Read an integer value from the current line
int integerValue;
csv >> integerValue;
// Show it on the screen
std::cout << integerValue << ' ';
// Update the sum of the row
sumForOneRow = sumForOneRow + integerValue;
}
// Now, the inner for loop for the 4 columns is done. Show sum to user
std::cout << " --> " << sumForOneRow << '\n';
// Line activities are done now for this line. Go on with next line
}
}
else std::cerr << "\n*** Error: Could not open 'col.csv'\n";
return 0;
}
Third try.
Obviously the matrix, shown in the question, does not reflect the real data. The real data might look like that:
9, 1, 2, 4
9, 2, 8, 0
3, 3, 3, 3
Without using std::getline we can do like the below:
#include <iostream>
#include <fstream>
int main() {
// Open file
std::ifstream csv("r:\\col.csv");
// and check, if it could be opened
if (csv) {
// Now the file is open. We have 3 rows and 4 columns
// We will use 2 nested for loops, one for the rows and one for the columns
// So, first the rows
for (int row = 0; row < 3; ++row) {
// Every row has a sum
int sumForOneRow = 0;
// And every row has 4 columns. Go through all columns of the current row.
for (int col = 0; col < 4; ++col) {
// Read an integer value from the current line
int integerValue;
char c; // for the comma
// The last value, the value in column 3 (We start counting with 0) is not followed by a comma
// Therefore we add special treatment for the last column
if (col == 3)
csv >> integerValue; // Read just the value
else
csv >> integerValue >> c; // Read value and comma
// Show it on the screen
std::cout << integerValue << ' ';
// Update the sum of the row
sumForOneRow = sumForOneRow + integerValue;
}
// Now, the inner for loop for the 4 columns is done. Show sum to user
std::cout << " --> " << sumForOneRow << '\n';
// Line activities are done now for this line. Go on with next line
}
}
else std::cerr << "\n*** Error: Could not open 'col.csv'\n";
return 0;
}
4th try. Based on the evolution of this thread and the request of the OP to use std::getline
#include <iostream>
#include <fstream>
#include <string>
int main() {
// Open file
std::ifstream csv("r:\\col.csv");
// and check, if it could be opened
if (csv) {
// Now the file is open. We have 3 rows and 4 columns
// We will use 2 nested for loops, one for the rows and one for the columns
// So, first the rows
for (int row = 0; row < 3; ++row) {
// Every row has a sum
int sumForOneRow = 0;
// And every row has 4 columns. Go through all columns of the current row.
for (int col = 0; col < 4; ++col) {
// Read a substring up to the next comma or end of line
std::string line;
// Special handling for last column. This is not followed by a comma
if (col == 3)
std::getline(csv, line);
else
std::getline(csv, line, ',');
// Convert string to integer
int integerValue = std::stoi(line);
// Show it on the screen
std::cout << integerValue << ' ';
// Update the sum of the row
sumForOneRow = sumForOneRow + integerValue;
}
// Now, the inner for loop for the 4 columns is done. Show sum to user
std::cout << " --> " << sumForOneRow << '\n';
// Line activities are done now for this line. Go on with next line
}
}
else std::cerr << "\n*** Error: Could not open 'col.csv'\n";
return 0;
}
IMHO this is more complicated than 3rd try
Yes, I can help you with a piece of code, but I am not sure, if you will understand the more modern C++ language elements.
So, what do we need to do?
Read the source csv file line by line
Get all integer values for this line
The integer values are arranged in columns and separated by white space
Show the values on the screen
Sum up the values in one line and show the sum
And in the resulting code, we will do exactly that.
So, first, we open the file and check if it could be opened. Then, in a simple for loop, we read all lines that are present in the source file, line by line.
In the body of the for loop, we take the current line that we just read and put it into a std::istringstream. We do that to make extraction of the integer values simpler. And for the extraction, we use the std::istream_iterator. This will iterate over all integers in the line and return them.
We store the values temporarily in a std::vector, using the range constructor of the std::vector. The begin iterator is the std::istream_iterator for the data type int and for our std::istringstream. The end iterator is the default constructed std::istream_iterator, so, simply {}. This will copy all values into the std::vector.
That is a powerful and compact one-liner.
We copy the values in the std::vector (the integer values of one line) to the console.
Then, we add up all values in the std::vector (all integers from one line) and show the result on the screen.
And the beauty of it: It doesn't matter, how many rows and columns are present. It will work always. The only thing we need is space separated integer values in lines.
And all this can be done with just 7 statements . . .
Please see:
#include <iostream>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
#include <algorithm>
#include <iterator>
#include <numeric>
int main() {
// Open file and check, if it could be opened
if (std::ifstream csv{ "col.csv" }; csv) {
// Read all lines from the source file
for (std::string line{}; std::getline(csv, line); ) {
// Put the current line in a stringstream for better extraction
std::istringstream iss{ line };
// Extract all integer values from this line and put them into a vector
std::vector values(std::istream_iterator<int>(iss), {});
// Show values on display
std::copy(values.begin(), values.end(), std::ostream_iterator<int>(std::cout, " "));
// Calculate the sum for one line and show it on the display.
std::cout << " --> " << std::accumulate(values.begin(), values.end(), 0) << '\n';
}
}
else std::cerr << "\n*** Error: Could not open 'col.csv'\n";
return 0;
}
Language is C++ 17
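If the real file separates the values with commas rather than white space, one possible adaptation is to turn each comma into a space before extracting. The sketch below rests on that assumption about the data and is not part of the answer above. rowSums is a made-up name, and it reads from any std::istream so it can be tried without a file.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <numeric>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: returns one sum per input line. Commas are treated
// like white space, so "9, 1, 2, 4" and "9 1 2 4" both work.
// An empty line yields a sum of 0.
std::vector<int> rowSums(std::istream& in)
{
    std::vector<int> sums;
    for (std::string line; std::getline(in, line); ) {
        // Replace every comma so that plain integer extraction can be used
        std::replace(line.begin(), line.end(), ',', ' ');
        std::istringstream iss{ line };
        std::vector<int> values(std::istream_iterator<int>{iss}, std::istream_iterator<int>{});
        sums.push_back(std::accumulate(values.begin(), values.end(), 0));
    }
    return sums;
}
```

As with the code above, the number of rows and columns does not need to be known in advance.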
My program works for small files, but if I use large files (bible, Artamenes (longest novel)) it never finishes. The program keeps using more memory. It starts with 5mb and was up to over 350 in 7 hours. Is it because it is very inefficient or am I missing something?
#include "stdafx.h"
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <algorithm>
using namespace std;
struct Pair // create a struct for each word so it includes not only the word, but its count
{
string word; //the inputted word, eventually
unsigned int frequency; //count for each word
Pair(unsigned int f, const string& w) : frequency(f), word(w) {} //create constructor
bool operator <(const Pair& str) const //for sort
{
return (str.frequency < frequency);
}
};
string rmPunct (string word)
{
unsigned int position;
while ((position = word.find_first_of("|.,:;\"'!¡?¿/()^[]{}\\;-_*+")) != string::npos) //remove any punctuation, etc.
{
word.erase(position, 1);
}
return word;
}
string allLower(string word)
{
std::transform(word.begin(), word.end(), word.begin(), ::tolower); //convert any uppercase letters to lower case
return word;
}
int main()
{
vector<Pair> myVector; //Create a vector of structs so I have a dynamic array (can extend)
fstream dataFile; // create the file stream
string fileName; // necessary to find the file
cout << "Enter the file name: ";
cin >> fileName;
dataFile.open(fileName); // open the file in input mode only (no output for safeness)
string word; //will be each word from the file
while (dataFile >> word) // the >> imports each word until it hits a space then loops again
{
word = rmPunct(word);
word = allLower(word);
Pair *p = new Pair(1,word);
myVector.push_back(*p); // pushes each newly created struct into the vector
if (dataFile.fail())
break; //stop when the file is done
}
for (unsigned int i=0;i<myVector.size();i++) //this double for-loop finds each word that was already found
{
for (unsigned int j = i+1;j<myVector.size();)
{
if (myVector[i].word == myVector[j].word) //simple comparing to find where the extra word lies
{
myVector.at(i).frequency++; //increment the count
myVector.erase(myVector.begin()+j);//and... delete the duplicate struct (which has the word in it)
}
else
j++;
}
}
sort(myVector.begin(), myVector.end());
ofstream results;
results.open("results.txt");
if (myVector.size() >= 60) //outputs the top 60 most common words
{
for (int i=0;i<60;i++) {
double percent = ((double)myVector[i].frequency/(double)myVector.size()*100);
results << (i+1) << ". '" << myVector[i].word << "' occured " << myVector[i].frequency << " times. " << percent << "%" << '\n';
}
}
else //if there are not 60 unique words in the file
for (unsigned int i=0;i<myVector.size(); i++)
{
double percent = ((double)myVector[i].frequency/(double)myVector.size()*100);
results << (i+1) << ". '" << myVector[i].word << "' occured " << myVector[i].frequency << " times. " << percent << "%" << '\n';
}
results.close();
}
This loop:
for (unsigned int i=0;i<myVector.size();i++) //this double for-loop finds each word that was already found
{
for (unsigned int j = i+1;j<myVector.size();)
{
if (myVector[i].word == myVector[j].word) //simple comparing to find where the extra word lies
{
myVector.at(i).frequency++; //increment the count
myVector.erase(myVector.begin()+j);//and... delete the duplicate struct (which has the word in it)
}
else
j++;
}
}
walks your words n^2 times (roughly). If we assume your 5MB file contains half a million words, that's 500000 * 500000 = 250 billion iterations, which will take some time to run through [and erasing words will "shuffle" the entire content of your vector, which is quite time-consuming if the vector is long and you shuffle an early item]
A better approach would be to build a data structure that you can search quickly, such as a map<std::string, int> words, where you do words[word]++; as you read the words. Then find the most common words by iterating over words and saving the 60 most common ones [keeping a sorted list of the 60 most common...]
You could also do something clever like min(60, words.size()) to know how many words you have.
You have a small memory leak in your program, and as the data you read gets larger so does the number of leaks.
The code causing the memory leak:
Pair *p = new Pair(1,word);
myVector.push_back(*p); // pushes each newly created struct into the vector
Here you dynamically allocate a Pair structure, copy the structure to the vector, and then completely ignore the originally allocated structure.
There is really no need for any dynamic allocation, or even a temporary variable, just do
myVector.push_back(Pair(1, word));
And if you have a newer compiler that supports C++11, then just do
myVector.emplace_back(1, word);
That should help you with part of the problem.
The other part is that your algorithm is slow, really really slow for large inputs.
That can be solved by using e.g. std::unordered_map (or std::map if you don't have std::unordered_map).
Then it becomes very simple, just use the word as the key, and the frequency as the data. Then for every word you read just do
frequencyMap[word]++;
No need for the loop-in-loop comparing words, which is what slows you down.
To get the 60 most frequent words, copy from the map into a vector using std::pair with the frequency as the first member of the pair and the word as the second, sort the vector, and simply print the 60 first entries in the vector.
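Put together, the counting and the top-N step might look roughly like this. It is a sketch under a few assumptions: topWords is a made-up name, the input comes from any std::istream, and the rmPunct/allLower clean-up from the question is left out for brevity (it would be applied to word before counting).

```cpp
#include <algorithm>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical helper: counts every white-space separated word in the stream
// and returns up to maxWords (frequency, word) pairs, most frequent first.
std::vector<std::pair<unsigned int, std::string>>
topWords(std::istream& in, std::size_t maxWords)
{
    std::unordered_map<std::string, unsigned int> frequencyMap;
    for (std::string word; in >> word; )
        ++frequencyMap[word];          // one hash lookup per word, no nested loop

    // Copy into a vector with the frequency as the first member of the pair
    std::vector<std::pair<unsigned int, std::string>> result;
    result.reserve(frequencyMap.size());
    for (const auto& entry : frequencyMap)
        result.emplace_back(entry.second, entry.first);

    // Highest frequency first; ties are broken alphabetically so the order is deterministic
    std::sort(result.begin(), result.end(),
              [](const std::pair<unsigned int, std::string>& a,
                 const std::pair<unsigned int, std::string>& b) {
                  if (a.first != b.first) return a.first > b.first;
                  return a.second < b.second;
              });

    if (result.size() > maxWords)
        result.resize(maxWords);       // keep only the first maxWords entries
    return result;
}
```

Calling topWords(dataFile, 60) would then replace both the nested loop and the two near-identical output branches, because the result never holds more than 60 entries.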
I am working on the NOV14 CodeChef contest problems, and I am stuck on this problem.
http://www.codechef.com/NOV14/problems/RBTREE
My algorithm works well, but I can't read the input correctly. The problem is that I don't know how many inputs are given, but I need to store them in multiple variables.
Take a look here:
5
Qb 4 5
Qr 4 5
Qi
Qb 4 5
Qr 4 5
where 5 is the number of test cases.
Can I read every test case into variables?
If I take the first test case, I can read Qb into one variable, 4 into another and 5 into another.
But the problem is how to read a line which starts with Qi.
Well, first of all, if you write C++, you should use C++ streams. Here's the code for input (which you can adjust for your own needs):
#include <iostream>
#include <fstream>
#include <string>
int main() {
std::ifstream file;
file.open("data.in");
int lines = 0;
file >> lines;
std::string query_type;
for (int i = 0; i < lines; i++) {
file >> query_type;
if (query_type == "Qi") {
std::cout << query_type << std::endl;
} else {
int x = 0;
int y = 0;
file >> x >> y;
std::cout << query_type << " " << x << " " << y << std::endl;
}
}
file.close();
return 0;
}
You'll need to check what you've read at each step, and then determine whether or not you need to read the numbers in.
So read two characters, and if the characters you've read are "Q" and "i", you don't need to read any numbers, and you can just step on to the next line. Otherwise, you should read the two numbers before going to the next line.
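The check-then-read idea can also be packaged so that the queries are stored instead of printed. The following is a sketch, not the contest solution: Query and readQueries are hypothetical names, and it assumes "Qi" is the only query type without arguments, as in the sample input.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one query line; x and y stay 0 for "Qi".
struct Query {
    std::string type;
    int x = 0;
    int y = 0;
};

// Reads the count, then that many queries. Every type other than "Qi"
// is expected to be followed by two integers.
std::vector<Query> readQueries(std::istream& in)
{
    std::vector<Query> queries;
    int count = 0;
    if (!(in >> count))
        return queries;                  // no count, nothing to read
    for (int i = 0; i < count; ++i) {
        Query q;
        if (!(in >> q.type))
            break;                       // input ended early
        if (q.type != "Qi" && !(in >> q.x >> q.y))
            break;                       // the two numbers are missing or malformed
        queries.push_back(q);
    }
    return queries;
}
```

For an online judge that feeds the data on standard input, readQueries(std::cin) should work the same way.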
#include <iostream>
#include <iomanip>
#include <cstdlib>
#include <fstream>
using namespace std;
void make_array(ifstream &num, int (&array)[50]);
int main(){
ifstream file; // variable controlling the file
char filename[100]; /// to handle calling the file name;
int array[50];
cout << "Please enter the name of the file you wish to process:";
cin >> filename;
cout << "\n";
file.open(filename);
if(file.fail()){
cout << "The file failed to open.\n";
exit(1);
}
else{
cout << "File Opened Successfully.\n";
}
make_array(file, array);
file.close();
return(0);
}
void make_array(ifstream &num, int (&array)[50]){
int i = 0; // counter variable
while(!num.eof() && i < 50){
num >> array[i];
i = i + 1;
}
for(i; i>=0; i--){
cout << array[i] << "\n";
}
}
Alright, so this is my code so far. When I output the contents of the array, I get two really large negative numbers before the expected output. For example, if the file had 1 2 3 4 in it, my program is outputting -6438230 -293948 1 2 3 4.
Can somebody please tell me why I am getting these ridiculous values?
Your code outputs the array backwards, and also it increments i twice after it has finished reading all the values. This is why you see two garbage values at the start. I suspect you are misreporting your output and you actually saw -6438230 -293948 4 3 2 1.
You end up with the extra increments because your use of eof() is wrong. This is an amazingly common error for some reason. See here for further info. Write your loop like this instead:
while ( i < 50 && num >> array[i] )
++i;
Now i holds the number of valid items in the list. Assuming you do actually want to output them backwards:
while ( i-- > 0 )
cout << array[i] << "\n";
To output them forwards you'll need two variables (one to store the total number of items in the array, and one to do the iteration)
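A forward version with the two variables could look like the sketch below. readInts and printForward are made-up names, and the streams are passed in so the functions can be exercised with a string stream instead of a file.

```cpp
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical helper: fills the array from the stream and returns how many
// values were actually read (at most N).
template <std::size_t N>
std::size_t readInts(std::istream& in, int (&array)[N])
{
    std::size_t count = 0;
    while (count < N && in >> array[count])  // stop on a full array or a failed read
        ++count;
    return count;
}

// Prints the first count elements in their original order.
void printForward(std::ostream& out, const int* array, std::size_t count)
{
    for (std::size_t i = 0; i < count; ++i)  // i iterates, count holds the total
        out << array[i] << "\n";
}
```

In make_array this would become std::size_t count = readInts(num, array); followed by printForward(cout, array, count);.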
The check !num.eof() only tells you that the last thing you read was not eof. So, if your file was 1 2 3 4, the check will only kick in after the 5th num>>array[i] call. However, for that i, array[i] will be populated with a meaningless value. The only correct way to deal with eofs is to check for validity on every call to operator>>. In other words, the right condition is simply num>>array[i]. This works by exploiting this conversion to bool since C++11 and to void* pre-C++11.
How do you read lines from a vector and compare their lengths? I have pushed strings inside a vector and now would like to find the longest line and use that as my output. I have this as my code all the way up to comparing the strings:
ifstream code_File ("example.txt");
size_t find_Stop1, find_Stop2, find_Stop3, find_Start;
string line;
vector<string> code_Assign, code_Stop;
if (code_File.is_open()) {
while ( getline(code_File,line)) {
find_Start = line.find("AUG"); // Finding all posssible start codes
if (find_Start != string::npos) {
line = line.substr(find_Start);
code_Assign.push_back(line); //adding line to Code_Assign
find_Stop2 = line.find("UGA"); // Try and find stop code.
if (find_Stop2 != string::npos) {
line = line.substr(line.find("AUG"), find_Stop2);
code_Stop.push_back(line); // Adding it to code_Stop vector
}
find_Stop1 = line.find("UAA"); // finding all possible stop codes.
if (find_Stop1 != string::npos) {
line = line.substr(line.find("AUG"), find_Stop1); // Assign string code_1 from start code to UGA
code_Stop.push_back(line); //Adding to code_Stop vector
}
find_Stop3 = line.find("UAG"); // finding all possible stop codes.
if (find_Stop3 != string::npos) {
line = line.substr(line.find("AUG"), find_Stop3);
code_Stop.push_back(line); //Adding to code_Stop vector
}
}
}
cout << '\n' << "Codes to use: " << endl;
for (size_t i = 0; i < code_Assign.size(); i++)
cout << code_Assign[i] << endl;
cout << '\n' << "Possible Reading Frames: " << endl;
for (size_t i = 0; i < code_Stop.size(); i++)
cout << code_Stop[i] << endl;
cout << endl;
std::vector<std::string>::iterator longest = std::max_element(code_Stop.begin(), code_Stop.end, compare_length);
std::string longest_line = *longest; // retrieve return value
code_File.close();
}
else cout << "Cannot open File.";
To clarify, my current output, all of it from the code_Stop vector, is this:
Possible Reading Frames:
AUG GGC CUC GAG ACC CGG GUU UAA AGU AGG
AUG GGC CUC GAG ACC CGG GUU
AUG AAA UUU GGG CCC AGA GCU CCG GGU AGC GCG UUA CAU
and I would just like to get the longest line.
Note, I am just learning about vectors, so please be kind...I have been getting a lot of help from this board and really much appreciated.
Ed. I changed the code to show where I have put it and it is giving me "Program received signal: 'EXC_BAD_ACCESS'". What have I done?
This should work:
#include <string>
#include <vector>
#include <algorithm>
bool compare_length(std::string const& lhs, std::string const& rhs) {
return lhs.size() < rhs.size();
}
int main() {
std::vector<std::string> lines; // fill with data
std::vector<std::string>::iterator longest = std::max_element(
lines.begin(), lines.end(),
compare_length);
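if (longest != lines.end()) { // max_element returns end() for an empty range; dereferencing that is undefined
std::string longest_line = *longest; // retrieve return value
}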
}
compare_length is a function that compares the length of two given strings. It returns true if the first string is shorter than the second one, and false otherwise.
std::max_element is a standard algorithm that finds the largest element in a sequence using the specified comparison function. lines.begin() and lines.end() return iterators to the beginning and the end of the sequence lines, thus specifying the range the algorithm should scan.
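With C++11 the comparison can be written inline as a lambda, and the empty case can be handled in one place. A minimal sketch, assuming an empty string is an acceptable result for an empty vector; longestLine is an illustrative name:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: returns the longest string, or an empty string when
// the vector is empty (max_element returns end() in that case).
std::string longestLine(const std::vector<std::string>& lines)
{
    auto longest = std::max_element(
        lines.begin(), lines.end(),
        [](const std::string& lhs, const std::string& rhs) {
            return lhs.size() < rhs.size();  // "less than" means "shorter than"
        });
    return longest == lines.end() ? std::string() : *longest;
}
```

If several lines share the maximum length, std::max_element returns the first of them.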