compare two files and send out equal values - c++

I'm new here. Trying to do something I think should be easy but can't get to work. I have two files which have just simple data in
FileA
KIC
757137
892010
892107
892738
892760
893214
1026084
1435467
1026180
1026309
1026326
1026473
1027337
1160789
1161447
1161618
1162036
3112152
1163359
1163453
1163621
3123191
1164590
and File B
KICID
1430163
1435467
1725815
2309595
2450729
2837475
2849125
2852862
2865774
2991448
2998253
3112152
3112889
3115178
3123191
I'd like to read both files, ignore the header lines, and print the values that appear in both. In this case 1435467, 3112152 and 3123191 are in both files, and only these should be written to a new file.
So far I have:
#include <cmath>
#include <cstdlib>
#include <string>
#include <iomanip>
#include <iostream>
#include <fstream>
#include <ctime>
using namespace std;
// Globals, to allow being called from several functions
// main program
int main() {
float A, B;
ifstream inA("FileA"); // input stream
ifstream inB("FileB"); // second instream
ofstream outA("OutA.txt"); // output stream
while (inA >> A) {
while (inB >> B) {
if (A == B) {
outA << A << "\t" << B << endl;
}
}
}
return 0;
}
This just produces an empty file, OutA.txt.
I thought this would read a line of FileA, then cycle through FileB until it found a match, send it to OutA, and then move on to the next line of FileA.
Any help would be appreciated.

You need to put
inB.clear();
inB.seekg(0, inB.beg);
at the end of the outer while loop. Otherwise inB stays at end-of-file with its failure flags set, and nothing more is read from it after the first entry of inA has been processed.

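Another problem is the use of float for A and B. Extracting the text header (KIC) into a numeric variable fails and leaves the stream in a failed state, so the outer loop never runs. Use string (or skip the header before reading int), and keep in mind that float may not behave as you expect with ==.
Refer to this question for details: What is the most effective way for float and double comparison?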
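As a hedged illustration of how a text header interacts with numeric extraction: reading a token such as KIC into a float sets failbit, so a while (in >> value) loop ends before it reaches the first number. The sketch below checks this on in-memory streams. The helper names count_as_float and count_as_string are made up for this example and are not part of the question's code.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts how many tokens can be extracted as float before the stream fails.
std::size_t count_as_float(std::istream& in) {
    float value;
    std::size_t n = 0;
    while (in >> value) ++n;
    return n;
}

// Counts how many whitespace-separated tokens can be extracted as strings.
std::size_t count_as_string(std::istream& in) {
    std::string token;
    std::size_t n = 0;
    while (in >> token) ++n;
    return n;
}
```

With input that starts with a header line, the float version stops immediately, while the string version sees the header and every value after it.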
This code worked on my platform:
...
while (inA >> A) {
inB.clear();
inB.seekg(0, inB.beg);
while (inB >> B) {
if (A == B) {
outA << A << "\t" << B << endl;
}
}
}
Note the calls to inB.clear() and inB.seekg(...); here A and B are strings.
By the way, this method is only good for a quick-and-dirty implementation. It is not optimal for big files, because the complexity is N * M (N is the size of FileA, M the size of FileB). With a hash set you can get close to linear (N + M) complexity.
Example of hash set implementation (C++11):
#include <string>
#include <iostream>
#include <fstream>
#include <unordered_set>
using namespace std;
int main() {
string A, B;
ifstream inA("FileA"); // input stream
ifstream inB("FileB"); // second instream
ofstream outA("OutA.txt"); // output stream
unordered_set<string> setA;
while (inA >> A) {
setA.insert(A);
}
while (inB >> B) {
if (setA.count(B)) {
outA << B << endl; // B is the value found in both files
}
}
return 0;
}
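The same hash-set idea can be written as a function over generic streams, which makes it easy to try without touching real files. This is only a sketch: the name common_values is hypothetical, and it assumes that the first token of each input is a header to be skipped.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <unordered_set>
#include <vector>

// Returns the values of `b` that also occur in `a`, in the order they appear in `b`.
// Assumption: the first token of each stream is a header and is skipped.
std::vector<std::string> common_values(std::istream& a, std::istream& b) {
    std::string token;
    std::unordered_set<std::string> seen;
    a >> token;                          // discard the header of the first stream
    while (a >> token) seen.insert(token);

    std::vector<std::string> result;
    b >> token;                          // discard the header of the second stream
    while (b >> token)
        if (seen.count(token)) result.push_back(token);
    return result;
}
```

Passing two std::ifstream objects works the same way, since std::ifstream is a std::istream.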

Are both the files small enough to read into memory?
You could try something similar to the following:
#include <algorithm>
#include <fstream>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> a;
std::vector<std::string> b;
std::ofstream outA("OutA.txt"); // output stream
std::ifstream inA("FileA"); // input stream
std::ifstream inB("FileB"); // second instream
std::string value;
inA >> value; // read first line (and don't use - discarding header)
while (inA >> value) { a.push_back(value); } // populate first vector
inB >> value; // read first line (and don't use - discarding header)
while (inB >> value) { b.push_back(value); } // populate second vector
// std::sort will perform a pretty efficient sort
std::sort(a.begin(), a.end());
std::sort(b.begin(), b.end());
// now that both are sorted, one pass over each finds the common values
std::vector<std::string>::iterator ita = a.begin();
std::vector<std::string>::iterator itb = b.begin();
while (ita != a.end() && itb != b.end())
{
if (*ita > *itb)
++itb;
else if (*ita < *itb)
++ita;
else
{
outA << *ita << '\n'; // match: write it and advance both iterators
++ita;
++itb;
}
}
return 0;
}
This reads both files into memory, sorts them both, and then compares them.
The comparison only has to go through each list once, which reduces its cost to O(a+b) instead of O(a*b). Of course the sorting adds overhead, but this should be more efficient for larger files, and still sufficiently fast for shorter ones (unless you are comparing lots and lots of small files).
With std::sort the overall cost is O(a log a + b log b), which is better than O(a*b).
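For what it's worth, the merge-style comparison of two sorted ranges is also available in the standard library as std::set_intersection from <algorithm>. A minimal sketch, assuming the values have already been read into vectors (the function name sorted_intersection is made up here):

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Sorts both inputs (taken by value, so the caller's vectors are untouched)
// and returns the elements common to both, in sorted order.
std::vector<std::string> sorted_intersection(std::vector<std::string> a,
                                             std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<std::string> common;
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(common));
    return common;
}
```

Note that the strings are compared lexicographically, which is fine for finding equal values but does not give numeric order in the result.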

In the end I fixed it like so
#include <cmath>
#include <cstdlib>
#include <string>
#include <iomanip>
#include <iostream>
#include <fstream>
#include <ctime>
using namespace std;
//Globals, to allow being called from several functions
//main program
int main() {
string A, B;
ifstream inA("FileA.txt"); //input stream
ifstream inB("FileB.txt") ;//second instream
ofstream outA("OutA.txt"); //output stream
while(inA>>A){//take in first stream
while(inB>>B){//whilst thats happening take in second stream
if (A==B){//do they match? If so then send out the value
outA<<A<<"\t"<<B<<endl; //THIS IS JUST SHOW A DOES = B!
}
}//end of B loop
inB.clear();//now clear the second stream (B)
inB.seekg(0, inB.beg);//return to start of stream B
}//move onto second input in stream A, and repeat
return 0;
}

Related

c++ String from file to vector - more elegant way

I am writing code in which I want to read several strings from a text file into a string vector. Currently I do it this way:
using namespace std;
int main()
{
string list_name="LIST";
ifstream REF;
REF.open(list_name.c_str());
vector<string> titles;
for(auto i=0;;i++)
{
REF>>list_name;
if(list_name=="-1"){break;}
titles.push_back(list_name);
}
REF.close();
cout<<titles.size();
for(unsigned int i=0; i<titles.size(); i++)
{
cout<<endl<<titles[i];
}
}
It works fine and I get the expected output. My concern: is there a more elegant way to read strings from a text file directly into the vector, avoiding this fragment, where each string is first extracted from the file stream into a string object and then added to the vector with push_back as a separate step:
REF>>list_name;
if(list_name=="-1"){break;}
titles.push_back(list_name);
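A more elegant way, using algorithms (note that this drops every "-1" token and keeps reading to the end of the file, rather than stopping at the first "-1"):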
std::copy_if(std::istream_iterator<std::string>(REF),
std::istream_iterator<std::string>(),
std::back_inserter(titles),
[](const std::string& t) { return t != "-1"; });
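To make the difference between stopping at a terminator and filtering it out concrete, here is a small sketch with two helpers. The names read_until and read_filtered are hypothetical and only serve this comparison.

```cpp
#include <algorithm>
#include <istream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Stops at the first sentinel token, like the original loop with break.
std::vector<std::string> read_until(std::istream& in, const std::string& sentinel) {
    std::vector<std::string> out;
    std::string token;
    while (in >> token && token != sentinel) out.push_back(token);
    return out;
}

// Reads to the end of the input and merely drops every sentinel token.
std::vector<std::string> read_filtered(std::istream& in, const std::string& sentinel) {
    std::vector<std::string> out;
    std::copy_if(std::istream_iterator<std::string>(in),
                 std::istream_iterator<std::string>(),
                 std::back_inserter(out),
                 [&sentinel](const std::string& t) { return t != sentinel; });
    return out;
}
```

If the "-1" is always the last token in the file, both give the same result. They only differ when data follows the terminator.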
The other answers are maybe too complicated or too complex.
Let me first do a small review of your code. Please see my comments within the code:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
using namespace std; // You should not open the full std namespace. Better to use full qualification
int main()
{
string list_name = "LIST";
ifstream REF; // Here you could directly use the constructor of the ifstream, which will open the file for you
REF.open(list_name.c_str()); // No need to use c_str
vector<string> titles; // All variables should be initialized. Use {}
for (auto i = 0;; i++) // Endless loop. You could also write for(;;), but bad design
{
REF >> list_name;
if (list_name == "-1") { break; } // Break out of the endless loop. Bad design. Curly braces not needed
titles.push_back(list_name);
}
REF.close(); // No need to close the file. With RAII, the destructor of the ifstream will close the file for you
cout << titles.size();
for (unsigned int i = 0; i < titles.size(); i++) // Better to use a range based for loop
{
cout << endl << titles[i]; // endl not recommended. For cout, '\n' is better, because it does not call flush unnecessarily.
}
}
As you can see, there are many points for improvement.
Let me explain some of the more important topics.
You should use the std::ifstream constructor to open the file directly.
Always check the result of such an operation. The bool and ! operators of std::ifstream are overloaded, so a simple test such as if (REF) is enough.
There is no need to close the file. The destructor of std::ifstream will do that for you.
There is a standard approach to reading a file. Please see below.
If you want to read a file until EOF (end of file) or until some other condition holds, you can simply use a while loop and call the extraction operator >>.
For example:
while (REF >> list_name) {
titles.push_back(list_name);
}
Why does this work? The extraction operator always returns a reference to the stream it was called on. So you can imagine that, after the string has been read, the loop condition becomes while (REF), because REF was returned by (REF >> list_name). And, as mentioned already, the bool operator of the stream is overloaded and returns the state of the stream. If the read failed, for example because of an error or because the end of the file was reached, then if (REF) is false.
Now for the additional condition: a comparison with "-1" can easily be added to the while statement.
while ((REF >> list_name) and (list_name != "-1")) {
titles.push_back(list_name);
}
This is a safe operation because of boolean short-circuit evaluation. If the first condition is already false, the second will not be evaluated.
With all the know-how above, the code could be refactored to:
#include <iostream>
#include <fstream>
#include <string>
#include <vector>
int main() {
// Here our source data is stored
const std::string fileName{ "list.txt" };
// Open the file and check, if it could be opened
std::ifstream fileStream{ fileName };
if (fileStream) {
// Here we will store all titles that we read from the file
std::vector<std::string> titles{};
// Now read all data and store it in our resulting vector
std::string tempTitle{};
while ((fileStream >> tempTitle) and (tempTitle != "-1"))
titles.push_back(tempTitle);
// For debug purposes. Show all titles on screen:
for (const std::string& title : titles)
std::cout << '\n' << title;
}
else std::cerr << "\n*** Error: Could not open file '" << fileName << "'\n";
}
If you knew the number of strings to read beforehand, you could size the vector up front:
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>
using StringVector = std::vector<std::string>;
int main() {
constexpr std::size_t N = 4; // or however many strings you want...
StringVector data(N);
std::ifstream stream("foo.txt");
for (std::size_t i = 0; (i < N) && stream; i++) {
stream >> data[i];
}
}
But this would be less flexible and it would be trickier to implement your "-1" "terminator" convention.
If that "-1" thing is a true requirement (in contrast to an arbitrary choice), and if you use this more than once, it might pay off to "abstract", how you read those strings. Abstraction is usually done in form of a function.
// compile with:
// clang++-13 -std=c++20 -g -O3 -o words words.cpp
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
using StringVector = std::vector<std::string>;
std::istream& operator>> (std::istream& stream, StringVector& sv)
{
std::string word;
while (stream >> word) { // stops on EOF or a failed read
if (word == "-1")
break; // terminator reached
sv.push_back(word);
}
return stream;
}
std::ostream& operator<< (std::ostream& stream,
const StringVector& sv) {
for (const auto& s : sv) {
stream << s << std::endl;
}
return stream;
}
int main(int argc, const char* argv[]) {
std::string file_data{R"(word1 word2
word3
word4 -1)"};
std::istringstream stream(file_data);
StringVector data;
data.reserve(10);
stream >> data;
std::cout
<< "Number of strings loaded: "
<< data.size() << std::endl;
std::cout << data;
return 0;
}
The above operator>>() works for streams in general, so it also works for file streams.
As an aside: One reason, why people would not like the "-1" terminator approach is performance. If you keep pushing into a vector an arbitrary amount of times, the storage of the vector needs to be re-allocated as the vector grows, which is avoidable overhead. So, usually people would use another file format, e.g. giving the number of strings first, then the strings, which would allow for:
size_t n;
stream >> n;
StringVector data;
data.reserve(n); // avoids "spurious reallocs as we load the strings"
for (size_t i = 0; i < n; i++) { ... }
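A possible shape for such a count-prefixed reader, as a sketch only: the function name read_counted and the exact file layout (a count followed by that many whitespace-separated strings) are assumptions made for this example.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Reads "<count> <string 1> ... <string count>".
// Returns fewer strings than announced if the input ends early.
// Caveat: the count is trusted here; a real reader would sanity-check it
// before reserving memory.
std::vector<std::string> read_counted(std::istream& in) {
    std::vector<std::string> data;
    std::size_t n = 0;
    if (!(in >> n)) return data;   // no readable count: return an empty vector
    data.reserve(n);               // one allocation up front
    std::string word;
    for (std::size_t i = 0; i < n && in >> word; ++i) data.push_back(word);
    return data;
}
```

Anything after the announced number of strings is simply left unread in the stream.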

How to read an array of complex numbers from a text file in C++

As a learner in c++, I decided to play with complex numbers, using the standard library. Now I need to read and write an array of complex from/to text files. This works simply for writing, without supplemental tricks :
void dump(const char *filename){
ofstream result;
result.open (filename);
for(int k=0;k<15;k++){
result<< outputs[k] <<endl;
}
result.close();
}
The data are parenthesized and written line by line looking like : (real,im)...
Now, I guess reading (and loading an array of complex numbers) should be as trivial as writing. However, despite my research, I have not found the right way to do that.
My first attempt was naive :
void readfile(const char *filename){
string line;
ifstream myfile (filename);
if (myfile.is_open())
{
int k=0;
while ( getline (myfile,line) ){
k++;
cout << line << endl;
inputs[k]= (complex<float>) line; //naive !
}
myfile.close();
}
else cout << "Unable to open file";
}
Is there a way to do that simply (without a string parser ) ?
Assuming you have an operator>> for your_complex_type (as has been mentioned, std::complex provides one), you can use an istream_iterator:
#include <fstream>
#include <iterator>
#include <vector>
int main()
{
std::ifstream input( "numbers.txt" );
std::vector<your_complex_type> buffer{
std::istream_iterator<your_complex_type>(input),
std::istream_iterator<your_complex_type>() };
}
This will read all numbers in the file and store them in an std::vector<your_complex_type>.
Edit about your comment
If you know the number of elements you will read up-front, you can optimize this as follows:
#include <fstream>
#include <iterator>
#include <vector>
int main()
{
std::ifstream input( "numbers.txt" );
std::vector<your_complex_type> buffer;
buffer.reserve(expected_number_of_entries);
std::copy(std::istream_iterator<your_complex_type>(input),
std::istream_iterator<your_complex_type>(),
std::back_inserter(buffer));
}
std::vector::reserve will make the vector reserve enough memory to store the specified number of elements. This will remove unnecessary reallocations.
You can also use similar code to write your numbers to a file:
std::vector<your_complex_type> numbers; // assume this is filled
std::ofstream output{ "numbers.txt" };
std::copy(std::begin(numbers), std::end(numbers),
std::ostream_iterator<your_complex_type>(output, "\n") );
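Putting the writing and reading halves together, here is a round-trip sketch with std::complex<double> standing in for your_complex_type. The alias Complex and the function names are chosen for this example only. It relies on the standard stream operators for std::complex, which write and read the (real,imag) form.

```cpp
#include <algorithm>
#include <complex>
#include <istream>
#include <iterator>
#include <ostream>
#include <sstream>
#include <vector>

using Complex = std::complex<double>;

// Writes one "(re,im)" value per line.
void write_complex(std::ostream& out, const std::vector<Complex>& values) {
    std::copy(values.begin(), values.end(),
              std::ostream_iterator<Complex>(out, "\n"));
}

// Reads every "(re,im)" value until extraction fails or the input ends.
std::vector<Complex> read_complex(std::istream& in) {
    return std::vector<Complex>(std::istream_iterator<Complex>(in),
                                std::istream_iterator<Complex>());
}
```

The same two functions should work unchanged with std::ofstream and std::ifstream in place of the string stream used for testing.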
C++ version:
std::complex<int> c;
std::ifstream fin("filename");
fin>>c;
C version:
int a,b;
FILE *fin=fopen("filename","r");
fscanf(fin,"(%d,%d)\n",&a,&b);
C++ read multiple lines with multiple complex values on each line
#include <complex>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>
int main ()
{
std::complex<int> c;
std::ifstream fin("test.in");
std::string line;
std::vector<std::complex<int> > vec;
vec.reserve(10000000);
while(std::getline(fin,line))
{
std::stringstream stream(line);
while(stream>>c)
{
vec.push_back(c);
}
}
return 0;
}

Printing out blank spaces from a text file in C++

#include <iostream>
#include <cstdlib>
#include <cctype>
#include <cmath>
#include <string>
#include <iomanip>
#include <fstream>
#include <stdio.h>
using namespace std;
int main()
{
ifstream file;
string filename;
char character;
int letters[153] = {};
cout << "Enter text file name: ";
cin >> filename;
file.open(filename.c_str());
if (! file.is_open())
{
cout << "Error opening file. Check file name. Exiting program." << endl;
exit(0);
}
while (file.peek() != EOF)
{
file >> character;
if(!file.fail())
{
letters[static_cast<int>(character)]++;
}
}
for (int i = 0; i <= 153; i++)
{
if (letters[i] > 0)
{
cout << static_cast<char>(i) << " " << letters[i] << endl;
}
}
exit(0);
}
#endif
Hi everyone, my current code counts the frequency of each letter from a text file. However, it does not count the number of blank spaces. Is there a simple way to printout the number of blank spaces in a .txt file?
Also, how come when I'm trying to access a vector item, I run into a seg fault?
For example, if I use:
cout << " " + letters[i] << endl;, it displays a segfault. Any ideas?
Thank you so much.
By default, iostreams formatted input extraction operations (those using >>) skip past all whitespace characters to get to the first non-whitespace character. Perhaps surprisingly, this includes the extraction operator for char. To have whitespace characters processed like any other character, use the noskipws manipulator before processing:
file >> std::noskipws;
Don't forget to set it back on later:
file >> std::skipws;
What if you're one of those crazy people who wants to make a function that leaves this aspect (or even all aspects) of the stream state as it was before it exits? Naturally, C++ provides a discouragingly ugly way to achieve this:
std::ios_base::fmtflags old_fmt = file.flags();
file >> std::noskipws;
... // Do your thang
file.flags(old_fmt);
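An alternative that avoids touching the format flags is unformatted input with get(), which never skips whitespace. The sketch below also touches on the segfault from the question: " " + letters[i] adds an integer to a pointer to the string literal instead of concatenating text, so the result points outside the literal. The helper names count_bytes and format_count are made up for this example.

```cpp
#include <array>
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Counts every byte in the stream, whitespace included.
// get() is unformatted input, so it never skips spaces or newlines.
std::array<std::size_t, 256> count_bytes(std::istream& in) {
    std::array<std::size_t, 256> counts{};
    char c;
    while (in.get(c)) ++counts[static_cast<unsigned char>(c)];
    return counts;
}

// Builds "<char> <count>" as a real string. Writing " " + count instead
// would do pointer arithmetic on the literal rather than concatenation.
std::string format_count(char c, std::size_t count) {
    return std::string(1, c) + " " + std::to_string(count);
}
```

The number of blank spaces is then simply counts[' '], and tabs and newlines can be looked up the same way.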
I'm only posting this as an alternative way of doing what you're apparently trying. This uses the same lookup table approach you use in your code, but uses an istreambuf_iterator for slurping unformatted (and unfiltered) raw characters out of the stream buffer directly.
#include <algorithm>
#include <cctype>
#include <climits>
#include <cstddef>
#include <cstdlib>
#include <fstream>
#include <iostream>
#include <iterator>
int main(int argc, char *argv[])
{
if (argc < 2)
return EXIT_FAILURE;
std::ifstream inf(argv[1]);
std::istreambuf_iterator<char> it_inf(inf), it_eof;
unsigned int arr[1 << CHAR_BIT] = {};
// Index through unsigned char so that bytes above 127 cannot produce an out-of-range index
std::for_each(it_inf, it_eof,
[&arr](char c){ ++arr[static_cast<unsigned char>(c)];});
for (std::size_t i=0;i<sizeof(arr)/sizeof(arr[0]);++i)
{
if (std::isprint(static_cast<int>(i)) && arr[i])
std::cout << static_cast<char>(i) << ':' << arr[i] << std::endl;
}
return 0;
}
Executing an earlier version of this program on its own source file generated the following (the exact counts depend on the source text):
:124
#:4
&:3
':2
(:13
):13
*:1
+:4
,:4
/:1
0:3
1:2
2:1
::13
;:10
<:19
=:2
>:7
A:2
B:1
C:1
E:2
F:1
H:1
I:3
L:1
R:2
T:2
U:1
X:1
[:8
]:8
_:10
a:27
b:1
c:19
d:13
e:20
f:15
g:6
h:5
i:42
l:6
m:6
n:22
o:10
p:1
r:37
s:20
t:34
u:10
v:2
z:2
{:4
}:4
Just a different way to do it, but hopefully it is clear that the C++ standard library usually offers elegant ways to do what you want if you dig deep enough to find what's in there. Wishing you good luck.

How To Parse String File Txt Into Array With C++

I am trying to write a C++ program, but I am not familiar with C++. I have a .txt file, which contains values as follows:
0
0.0146484
0.0292969
0.0439453
0.0585938
0.0732422
0.0878906
What I have done in my C++ code is as follows:
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
string line;
ifstream myReadFile;
myReadFile.open("Qi.txt");
if(myReadFile.is_open())
{
while(myReadFile.good())
{
getline(myReadFile,line);
cout << line << endl;
}
myReadFile.close();
}
return 0;
}
I would like to make the output of the program an array, i.e.
line[0] = 0
line[1] = 0.0146484
line[2] = 0.0292969
line[3] = 0.0439453
line[4] = 0.0585938
line[5] = 0.0732422
line[6] = 0.0878906
Assuming you want your data stored as floating point numbers (not strings) you probably want to do something like this:
#include <iostream>
#include <vector>
#include <iterator>
#include <fstream>
int main() {
std::ifstream in("Qi.txt");
// initialize the vector from the values in the file:
std::vector<double> lines{ std::istream_iterator<double>(in),
std::istream_iterator<double>() };
// Display the values:
for (int i=0; i<lines.size(); i++)
std::cout << "lines[" << i << "] = " << lines[i] << '\n';
}
Just a quick note on style: I prefer to see variables fully initialized right when you create them, so std::ifstream in("Qi.txt"); is preferable to std::ifstream in; in.open("Qi.txt");. Likewise, it's preferable to initialize the vector of lines directly from istream iterators rather than create an empty vector, then fill it in an explicit loop.
Finally, note that if you insist on writing an explicit loop anyway, you never want to use something like while (somestream.good()) or while (!somestream.eof()) to control your loop -- these are mostly broken, so they don't (dependably) read a file correctly. Depending on the type of data involved, they'll frequently appear to read the last item from the file twice. Usually, you want something like while (file >> value) or while (std::getline(file, somestring)). These check the state of the file immediately after reading, so as soon as reading fails they fall out of the loop, avoiding the problems of the while (good()) style.
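To see the difference between the two loop styles in isolation, here is a small sketch that reads lines from an in-memory stream both ways. The function names are hypothetical. The first one deliberately reproduces the while (good()) pattern to show what goes wrong.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Anti-pattern: the state is tested before the read, so the failed final
// getline still pushes a (now empty) line.
std::vector<std::string> read_lines_checking_good(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (in.good()) {
        std::getline(in, line);
        lines.push_back(line);
    }
    return lines;
}

// Recommended: the read itself is the loop condition.
std::vector<std::string> read_lines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) lines.push_back(line);
    return lines;
}
```

For input that ends with a newline, the first version returns one extra, empty entry, while the second returns exactly the lines that are present.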
Oh, as a side note: this is written expecting a compiler that (at least sort of) conforms with C++11. For an older compiler you'd want to change this:
// initialize the vector from the values in the file:
std::vector<double> lines{ std::istream_iterator<double>(in),
std::istream_iterator<double>() };
...to something like this:
// initialize the vector from the values in the file:
std::vector<double> lines(( std::istream_iterator<double>(in)),
std::istream_iterator<double>() );
First you'll need a vector:
std::vector<std::string> lines; // requires #include <vector>
Then you'll need to take a string taken from the getline operation, and push it back into the vector. It's very simple:
for (std::string line; std::getline(myReadFile, line);)
{
lines.push_back(line);
}
For an output operation, all you need is:
{
int i = 0;
for (auto a : lines)
{
std::cout << "lines[" << i++ << "] = " << a << std::endl;
}
}
#include <iostream>
#include <fstream>
using namespace std;
int main()
{
string line;
ifstream myReadFile;
myReadFile.open("Qi.txt");
if(myReadFile.is_open())
{
for(int i=0;i<7;++i)
if(myReadFile.good())
{
getline(myReadFile,line);
cout<<"line["<<i<<"] = " << line << endl;
}
myReadFile.close();
}
return 0;
}

Array of Ofstream in c++

I want 41 output files to write text to in my project. I first create a string array, list, to name those output files. Then I try to define an array of ofstream objects and use list to name them, but I get this error that 'outfile' cannot be used as a function. Below is my code:
#include <sstream>
#include <string>
#include <iostream>
#include <fstream>
using namespace std ;
int main ()
{
string list [41];
int i=1;
ofstream *outFile = new ofstream [41];
for (i=1;i<=41 ;i++)
{
stringstream sstm;
sstm << "subnode" << i;
list[i] = sstm.str();
}
for (i=0;i<=41;i++)
outFile[i] (list[i].c_str());
i=1;
for (i=1;i<=41;i++)
cout << list[i] << endl;
return 0;
}
See below for the following fixes:
Don't use new unless you have to. You were leaking all the files, and not destructing them properly can lose data: an ofstream might not be flushed if you don't close it properly, and the pending output buffer will be lost.
Use proper array indexing (starting from 0!).
Call .open(...) on a default-constructed ofstream to open a file.
Recommendations:
I'd recommend against using namespace std; (not changed below).
I recommend reusing the stringstream. This is good practice.
Prefer C++-style loop index variables (for (int i = ....). This prevents surprises from i having excess scope.
In fact, get with the times and use a range-based for.
#include <sstream>
#include <string>
#include <iostream>
#include <fstream>
using namespace std;
int main ()
{
ofstream outFile[41];
stringstream sstm;
for (int i=0;i<41 ;i++)
{
sstm.str("");
sstm << "subnode" << i;
outFile[i].open(sstm.str());
}
for (auto& o:outFile)
cout << std::boolalpha << o.good() << endl;
}
You can not call the constructor as you do. Try calling outFile[i].open(list[i].c_str()). Note the 'open'.
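If a fixed-size array is not required, a std::vector of streams is another option, since std::ofstream is movable from C++11 on. A rough sketch, assuming zero-based names such as subnode0; both function names are invented for this example:

```cpp
#include <cstddef>
#include <fstream>
#include <string>
#include <vector>

// Builds "subnode0", "subnode1", ... (zero-based, matching the indices).
std::vector<std::string> subnode_names(std::size_t count) {
    std::vector<std::string> names;
    names.reserve(count);
    for (std::size_t i = 0; i < count; ++i)
        names.push_back("subnode" + std::to_string(i));
    return names;
}

// Opens one output file per name; the streams close themselves on destruction.
std::vector<std::ofstream> open_all(const std::vector<std::string>& names) {
    std::vector<std::ofstream> files;
    files.reserve(names.size());
    for (const std::string& name : names) files.emplace_back(name);
    return files;
}
```

Calling open_all(subnode_names(41)) would then give 41 streams. Each one should still be checked with is_open() or good() before it is written to.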