I made this problem by myself!
I'm reading a file, in C, where each line contains a number (random between 0 to 1000000):
1121
84
928434
9999
70373
...
I read the file line by line. For each line, I do some calculation and write a big chunk of data into a file named d_File.txt, where d is the least significant digit of the number read. Assume that writing to a file takes a long time, so I want to write the code with multiple threads so that I can write to multiple files (~10) at the same time. While the single-threaded code is obvious, I'm wondering what multi-threaded code using pthread would look like.
Single-threaded C++ code:
#include <fstream>
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
int func(int a)
{
//assume the data is big and writing takes a long time
int data = a;
return data;
}
int main()
{
ifstream in("numbers.txt");
int a;
while(in >> a)
{
stringstream ss;
ss << a%10;
string str;
ss >> str;
str += "_File.txt";
ofstream out(str.c_str(), fstream::in | fstream::out | fstream::trunc);
//This is blocking, if write takes long
//but can be rewritten in a multi-thread fashion
// to allow upto 10 simultaneous file write
out << func(a) << endl;
}
return 0;
}
You can definitely read a file, and multiple sections of a file, at the same time. Check out this SO answer. If that isn't enough for you there are lots more on SO and across the web explaining how to both read and write to ASCII in parallel.
I am curious to know from the more experienced C++ programmers out there whether there is a way to have one function read files that have different formats. For example, one file has the layout house number then street name, and the other file has the opposite, street name then house number. Again, I am just curious whether there is some sleek code that I could add to my knowledge and toolbox, instead of writing two different inFile functions.
Thank you all for your time.
EDIT
Here is some of the code:
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
struct WH
{
int inum;
string iname;
string warname;
int quant;
float whole;
float markup;
float retail;
};
void readinprime(WH ware[]);
ifstream inFile;
ofstream outFile;
int main()
{
WH ware[100];
//inFile.open("WLL1.txt", ios::in);
//inFile.open("WLL2.txt", ios::in);
//inFile.open("WLL3.txt", ios::in);
//inFile.open("WLL4.txt", ios::in);
return 0;
}
void readinprime(WH ware[])
{
int c;
for(c = 0; c < 100; c++)
{
inFile >> ware[c].inum >> ware[c].iname;
}
}
So essentially the first file (WLL1.txt) has the format integer->string, and the next file (WLL2.txt) has the format string->integer. My question is: is there another way to write the read-in function so that it can read int then string as well as string then int, without writing another function? I don't mind writing another function for every file format, but I was just curious whether someone had some good tricks that I could add to my toolbox. Again, thank you for your time.
You can use std::getline to read each line of the file into a string, and then parse that string according to whichever format the line uses.
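One way the getline idea could be turned into a single reader for both layouts is sketched below. It assumes each line holds exactly two whitespace-separated tokens and that the text token contains no spaces; Record, to_int and parse_record are hypothetical names, not part of the question's code.

```cpp
#include <sstream>
#include <string>

struct Record {
    int number;
    std::string text;
};

// True only if the whole token is an integer (so "12ab" is rejected).
bool to_int(const std::string& token, int& value) {
    std::istringstream ss(token);
    return (ss >> value) && ss.eof();
}

// Parse a line that is either "int string" or "string int".
// Returns false if neither token is an integer or a token is missing.
bool parse_record(const std::string& line, Record& rec) {
    std::istringstream ss(line);
    std::string first, second;
    if (!(ss >> first >> second)) return false;
    if (to_int(first, rec.number)) {
        rec.text = second;
        return true;
    }
    if (to_int(second, rec.number)) {
        rec.text = first;
        return true;
    }
    return false;
}
```

The caller would loop with while (std::getline(inFile, line)) and pass each line to parse_record, so the same function serves WLL1.txt and WLL2.txt.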
First, I would like to say that I am posting my question after a lot of searching on the internet, without finding a proper article or solution for what I'm looking for.
As mentioned in the title, I need to convert an ASCII file to a binary file.
My file is composed of lines, and every line contains floats separated by spaces.
I found that many people use C++ since it's easier for this kind of task.
I tried the following code, but the generated file is far too big.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main(int argc, char const *argv[])
{
char buffer;
ifstream in("Points_in.txt");
ofstream out("binary_out.bin", ios::out|ios::binary);
float nums[9];
while (!in.eof())
{
in >> nums[0] >> nums[1] >> nums[2]>> nums[3] >> nums[4] >> nums[5]>> nums[6] >> nums[7] >> nums[8];
out.write(reinterpret_cast<const char*>(nums), 9*sizeof(float));
}
return 0;
}
I found these two resources:
http://www.eecs.umich.edu/courses/eecs380/HANDOUTS/cppBinaryFileIO-2.html
https://r3dux.org/2013/12/how-to-read-and-write-ascii-and-binary-files-in-c/
I would appreciate any other resources.
Lines in my ASCII input file look like this:
-16.505 -50.3401 -194 -16.505 -50.8766 -193.5 -17.0415 -50.3401 -193.5
Thank you for your time
Here is a simpler method:
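#include <cstdlib>
#include <fstream>
#include <iostream>
int main()
{
float value = 0.0f;
std::ifstream input("my_input_file.txt");
// Open in binary mode so the bytes are written unchanged on every platform.
std::ofstream output("output.bin", std::ios::binary);
while (input >> value)
{
// reinterpret_cast is required: static_cast from float* to char* does not compile.
output.write(reinterpret_cast<const char *>(&value), sizeof(value));
}
input.close(); // explicitly close the file.
output.close();
return EXIT_SUCCESS;
}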
In the above code fragment, a float is read using formatted read into a variable.
Next, the number is output in its raw, binary form.
The reading and writing repeat until there is no more input data.
Exercises for the reader/OP:
1. Error handling for opening of files.
2. Optimize the reading and writing (read & write using bigger blocks of data).
I'm new here. I'm trying to do something I think should be easy, but I can't get it to work. I have two files that contain just simple data:
FileA
KIC
757137
892010
892107
892738
892760
893214
1026084
1435467
1026180
1026309
1026326
1026473
1027337
1160789
1161447
1161618
1162036
3112152
1163359
1163453
1163621
3123191
1164590
and File B
KICID
1430163
1435467
1725815
2309595
2450729
2837475
2849125
2852862
2865774
2991448
2998253
3112152
3112889
3115178
3123191
I'd like to read both files and then print out the values that appear in both, ignoring the titles. In this case I'd get that 1435467, 3112152 and 3123191 are in both, and just these would be sent to a new file.
So far I have:
#include <cmath>
#include <cstdlib>
#include <string>
#include <iomanip>
#include <iostream>
#include <fstream>
#include <ctime>
using namespace std;
// Globals, to allow being called from several functions
// main program
int main() {
float A, B;
ifstream inA("FileA"); // input stream
ifstream inB("FileB"); // second instream
ofstream outA("OutA.txt"); // output stream
while (inA >> A) {
while (inB >> B) {
if (A == B) {
outA << A << "\t" << B << endl;
}
}
}
return 0;
}
This just produces an empty document, OutA.
I thought this would read a line of FileA, then cycle through FileB until it found a match, send it to OutA, and then move on to the next line of FileA.
Any help would be appreciated.
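You need to put
inB.clear();
inB.seekg(0, inB.beg);
at the end of the outer while loop. Otherwise inB stays at end-of-file, and nothing is read from it after the first entry of inA has been processed. The clear() call is needed because the failed read leaves the stream in a failed state, and seekg does nothing on a failed stream.
Another problem is that you are using float for A and B. Try int (or string) instead, as float may not behave as you expect with ==. With a numeric type, the header line (KIC) also fails to parse, which stops the outer loop before any value is read.
Refer to this question for details: What is the most effective way for float and double comparison?.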
This code worked on my platform:
...
while (inA >> A) {
inB.clear();
inB.seekg(0, inB.beg);
while (inB >> B) {
if (A == B) {
outA << A << "\t" << B << endl;
}
}
}
Notice the inB.clear() and inB.seekg(...) calls; A and B are strings.
By the way, this method is only good for a quick-and-dirty implementation. It is not optimal for big files, as you get N * M complexity (N is the size of FileA, M is the size of FileB). By using a hash set you can get close to linear (N + M) complexity.
Example of a hash set implementation (C++11):
#include <string>
#include <iostream>
#include <fstream>
#include <unordered_set>
using namespace std;
int main() {
string A, B;
ifstream inA("FileA"); // input stream
ifstream inB("FileB"); // second instream
ofstream outA("OutA.txt"); // output stream
unordered_set<string> setA;
while (inA >> A) {
setA.insert(A);
}
while (inB >> B) {
if (setA.count(B)) {
outA << B << endl; // B is the matching value; A only holds the last token read from FileA
}
}
return 0;
}
Are both the files small enough to read into memory?
You could try something similar to the following:
#include <algorithm>
#include <fstream>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> a;
std::vector<std::string> b;
std::ofstream outA("OutA.txt"); // output stream
std::ifstream inA("FileA"); // input stream
std::ifstream inB("FileB"); // second instream
std::string value;
inA >> value; // read first line (and don't use - discarding header)
while (inA >> value) { a.push_back(value); } // populate first vector
inB >> value; // read first line (and don't use - discarding header)
while (inB >> value) { b.push_back(value); } // populate second vector
// std::sort will perform a pretty efficient sort
std::sort(a.begin(), a.end());
std::sort(b.begin(), b.end());
// now that both are sorted, walk them in step
std::vector<std::string>::iterator ita = a.begin();
std::vector<std::string>::iterator itb = b.begin();
while (ita != a.end() && itb != b.end())
{
if (*ita > *itb)
++itb;
else if (*ita < *itb)
++ita;
else
{
outA << *ita << '\n'; // match: write it and advance both iterators
++ita;
++itb;
}
}
return 0;
}
This reads both files into memory, sorts them both, and then compares them.
The comparison only has to go through each list once, which makes it O(a+b) instead of O(a*b). Of course the sorting adds overhead, but this should be more efficient for larger files, and for shorter files it should still be sufficiently fast (unless you are comparing lots and lots of small files).
I believe that with std::sort the worst case for all of this is O(a log a + b log b), which is better than O(a*b).
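The sort-then-walk comparison described above is also available in the standard library as std::set_intersection, so the manual loop can be replaced. A minimal sketch, where common_values is a hypothetical helper name:

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <vector>

// Return the values present in both inputs.
// The vectors are taken by value so the caller's copies stay unsorted.
std::vector<std::string> common_values(std::vector<std::string> a,
                                       std::vector<std::string> b) {
    std::sort(a.begin(), a.end());
    std::sort(b.begin(), b.end());
    std::vector<std::string> common;
    // Requires both ranges to be sorted with the same ordering.
    std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                          std::back_inserter(common));
    return common;
}
```

Because the values are compared as strings, the result comes back in lexicographic order rather than numeric order, which is fine when the goal is only to find the shared IDs.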
In the end I fixed it like so
#include <cmath>
#include <cstdlib>
#include <string>
#include <iomanip>
#include <iostream>
#include <fstream>
#include <ctime>
using namespace std;
//Globals, to allow being called from several functions
//main program
int main() {
string A, B;
ifstream inA("FileA.txt"); //input stream
ifstream inB("FileB.txt") ;//second instream
ofstream outA("OutA.txt"); //output stream
while(inA>>A){//take in first stream
while(inB>>B){//whilst thats happening take in second stream
if (A==B){//do they match? If so then send out the value
outA<<A<<"\t"<<B<<endl; //THIS IS JUST SHOW A DOES = B!
}
}//end of B loop
inB.clear();//now clear the second stream (B)
inB.seekg(0, inB.beg);//return to start of stream B
}//move onto second input in stream A, and repeat
return 0;
}
I am trying to run this, but the file constantly fails to load. What I am trying to do is load a dictionary into an array, with each element of the array holding one word.
#include <iostream>
#include <string>
#include <fstream>
#include <time.h>
#include <stdlib.h>
using namespace std;
int Rand;
void SetDictionary(){
srand(time(NULL));
Rand = rand() % 235674;
fstream file("Hangman.txt");
if(file.is_open()){
string Array[235675];
for(int X = 0; X < 235673; X++){
file >> Array[X];
}
cout << Array[Rand];
}else{
cout << "Unable To Open File\n";
}
}
int main(){
SetDictionary();
}
vector<string> words;
{
ifstream file("Hangman.txt");
string word;
while (file >> word)
{
words.push_back(word);
}
}
string randword = words[rand() % words.size()];
First, I see that you do not reuse Array after cout << Array[Rand] is done, so you do not need an array at all in this case. Read the file word by word into a temporary variable, print that variable when X == Rand, and then break.
Second, the implementation could be improved, assuming you only want to print a random word from the file. It would be much faster to generate Rand in the range 0..file size and then seek to that offset. You are now "inside" some word, and the task is to scan backward and forward to find where that word begins and ends. Note that this algorithm gives a slightly different probability distribution, because longer words are more likely to be hit.
Third, if you plan to reuse the file data, it would be much faster to read the whole file into memory and then split it into words, storing the word offsets as an array of integers.
Finally, with really huge dictionaries (or if the program runs with limited memory), it is possible to store only the word offsets and re-read the dictionary contents on the fly.
I'm new to C++, and I'm trying to write a short C++ program that reads lines of text from a file, with each line containing one integer key and one alphanumeric string value (no embedded whitespace). The number of lines is not known in advance (i.e., keep reading lines until end of file is reached). The program needs to use the 'std::map' data structure to store the integers and strings read from input (and to associate integers with strings). The program then needs to output the string values (but not the integer values) to standard output, one per line, sorted by integer key (smallest to largest). So, for example, suppose I have a text file called "data.txt" which contains the following five lines:
10 dog
-50 horse
0 cat
-12 zebra
14 walrus
The output should then be:
horse
zebra
cat
dog
walrus
I've pasted below the progress I've made so far on my C++ program:
#include <fstream>
#include <iostream>
#include <map>
using namespace std;
using std::map;
int main ()
{
string name;
signed int value;
ifstream myfile ("data.txt");
while (! myfile.eof() )
{
getline(myfile,name,'\n');
myfile >> value >> name;
cout << name << endl;
}
return 0;
myfile.close();
}
Unfortunately, this produces the following incorrect output:
horse
cat
zebra
walrus
If anyone has any tips, hints, suggestions, etc. on changes and revisions
I need to make to the program to get it to work as needed, can you please
let me know?
Thanks!
See it:
#include <iostream>
#include <string>
#include <fstream>
using namespace std;
int main()
{
string name;
int value;
ifstream myfile("data.txt", ifstream::in);
while(myfile >> value >> name)
cout << name << endl;
return 0;
}
You are having problems because you attempt to read each line twice: first with getline and then with operator>>.
You haven't actually used std::map in any regard, at all. You need to insert the integer/string pair into the map, and then iterate over it as the output. And there's no need to close() the stream.
Instead of looping on "! myfile.eof()", use the read itself as the loop condition, so that the loop stops as soon as a read fails:
ifstream is;
string srg;
is.open(filename);
while(getline(is,srg))
{//your code
}