I have a huge file full of numbers (2 billion of them). This is a file splitter that should split the file into groups of 100000 numbers, but it produces empty files containing only spaces and newlines. I even tried changing the data type of the variables. I am stuck. Please suggest a fix.
#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <fstream>
#include <algorithm>
using namespace std;
int main()
{
std::ifstream ifs("prime.txt");
unsigned long long int curr;
unsigned long long int x = 0;
string li;
int count;
while (getline(ifs, li))
{
count ++;
}
ifs.seekg(ios::beg);
string v;
while (curr < count)
{
x++;
std::string file = to_string(x) ;
std::string filename = "splitted\\"+file+ ".txt";
std::ofstream ofile (filename.c_str());
while (curr < 100000*x )
{
ifs >> v ;
ofile << v << "\n";
curr++;
}
ofile.close();
}
}
You have two uninitialised variables, count and curr. Your compiler should have warned you about these; if it didn't, make sure you have enabled compiler warnings.
After the last getline in your initial while loop fails, the stream has its fail and eof flags set. Because the stream is no longer in a good state, all further operations on it fail, so your seekg is ignored, as are all your reads. To fix this, call ifs.clear(); before your seekg.
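A minimal sketch of that fix, using an in-memory std::istringstream in place of prime.txt so that it stands on its own. The helper names count_lines_then_rewind and first_token_after_rewind are hypothetical and only serve the illustration:

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: counts the lines in a stream, then rewinds it so the
// data can be read a second time.
std::size_t count_lines_then_rewind(std::istream& in)
{
    std::size_t lines = 0;
    std::string line;
    while (std::getline(in, line))
        ++lines;

    // The failed getline left eofbit and failbit set. Clear them first,
    // otherwise the seekg below and every later read would fail.
    in.clear();
    in.seekg(0, std::ios::beg);
    return lines;
}

// Hypothetical helper: counts the lines of `text`, rewinds, and returns the
// first whitespace-separated token read after the rewind.
std::string first_token_after_rewind(const std::string& text)
{
    std::istringstream in(text);
    count_lines_then_rewind(in);
    std::string token;
    in >> token;
    return token;
}
```

Without the in.clear() call, the extraction in first_token_after_rewind would fail and leave token empty, which is the symptom described in the question.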
As you don't seem to need the line count anywhere, pre-calculating it is unnecessary. And because you then don't read the file line by line, it leads to incorrect behaviour if your file has more than one value on a line. Your code can be simplified to:
#include <iostream>
#include <vector>
#include <string>
#include <iterator>
#include <fstream>
#include <algorithm>
#include <cstdint> // for int64_t
using namespace std;
int main()
{
std::ifstream ifs("prime.txt");
if (!ifs)
{
std::cout << "error opening input file\n";
return 1;
}
int64_t fileNumber = 0;
int64_t fileCount = 0;
std::ofstream ofile;
while (ifs)
{
if (!ofile.is_open() || (fileCount >= 100000))
{
ofile.close();
fileCount = 0;
fileNumber++;
std::string file = to_string(fileNumber);
std::string filename = "splitted\\" + file + ".txt";
ofile.open(filename.c_str());
if (!ofile)
{
std::cout << "error opening output file\n";
return 1;
}
}
std::string value;
if (ifs >> value) {
ofile << value << "\n";
fileCount++;
}
}
}
I must say I'm completely new to C++. I got the following problem.
I've got a text file which only has one 8 digits number
Text-File: "01485052"
I want to read the file and put all digits into a vector, e.g. Vector v = ( 0, 1, 4, 8, 5, 0, 5, 2 ). Then write it into another text file.
What is the best way to implement this? Here is what I have so far:
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <sstream>
using namespace std;
int main()
{
char matrikelnummer[100];
cout << "Enter file name: ";
cin >> matrikelnummer;
// Declare input file stream variable
ifstream inputFile(matrikelnummer);
string numbers;
//Check if exists and then open the file
if (inputFile.good()) {
//
while (getline(inputFile, numbers))
{
cout << numbers;
}
// Close the file
inputFile.close();
}
else // In case TXT file does not exist
{
cout << "Error! This file does not exist.";
exit(0);
return 0;
}
// Writing solutions into TXT file called Matrikelnummer_solution.txt
ofstream myFile;
myFile.open("Matrikelnummer_solution.txt");
myFile << "Matrikelnummer: " << numbers << '\n';
myFile.close();
return 0;
}
You can use the following program for writing the number into another file and also into a vector:
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <sstream>
using namespace std;
int main()
{
ifstream inputFile("input.txt");
std::string numberString;
int individualNumber;
std::vector<int> vec;
if(inputFile)
{
std::ofstream outputFile("outputFile.txt");
while(std::getline(inputFile, numberString,'\n'))//go line by line
{
for(int i = 0; i < numberString.size(); ++i)//go character by character
{
individualNumber = numberString.at(i) - '0';
outputFile << individualNumber;//write individualNumber into the output file
vec.push_back(individualNumber);//add individualNumber into the vector
}
}
outputFile.close();
}
else
{
std::cout<<"input file cannot be opened"<<std::endl;
}
inputFile.close();
//print out the vector
for(int elem: vec)
{
std::cout<<elem<<std::endl;
}
return 0;
}
Read the file contents into numbers using inputFile >> numbers. Then add each digit character of the string to a std::vector.
To write to the file, write each element of the vector in a for loop.
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
#include <sstream>
using namespace std;
int main()
{
char matrikelnummer[100];
cout << "Enter file name: \n";
cin >> matrikelnummer;
// Declare input file stream variable
ifstream inputFile(matrikelnummer);
string numbers;
vector<int> individualNumbers;
//Check if exists and then open the file
if (inputFile.good()) {
inputFile >> numbers;
for (int i = 0; i < numbers.length(); i++) {
if (numbers[i] >= '0' && numbers[i] <= '9')
individualNumbers.push_back(numbers[i] - '0');
}
// Close the file
inputFile.close();
}
else // In case TXT file does not exist
{
cout << "Error! This file does not exist.";
exit(0);
return 0;
}
// Writing solutions into TXT file called Matrikelnummer_solution.txt
ofstream myFile;
myFile.open("Matrikelnummer_solution.txt");
myFile << "Matrikelnummer: ";
for (int number : individualNumbers) {
myFile << number << " ";
}
myFile << endl;
myFile.close();
return 0;
}
Hi, I am trying to read a specific line from a text file, update it, and put it back on the same line without affecting the other lines, in C++.
Here is the code I am trying to run; the values get added again each time I re-execute it.
#include <iostream>
#include <fstream>
#include <stdio.h>
#include <string>
#include <sstream>
using namespace std;
void stringGen(char num){
ifstream ifile;
ifile.open("example1.txt");
if(ifile) {
int LINE = 5;
string line;
ifstream myfile1 ("example1.txt");
for (int i = 1; i <= LINE; i++)
getline(myfile1, line);
cout << line<<endl;
stringstream geek(line);
int num=0;
geek>>num;
if(num<61004){
num=num+1;
ofstream MyFile("example1.txt");//
MyFile.close();
}
else{
num=61001;
ofstream MyFile("example1.txt");//
MyFile << num;
MyFile.close();
}
}
else{
int num=61001;
cout<<num<<endl;
ofstream MyFile("example1.txt");//
MyFile << num+1;
MyFile.close();
}
}
int main (){
char num;
stringGen(num);
return 0;
}
First, you need to understand how files with lines are stored. Simplified, a file is a sequence of bytes, one after the other. This byte sequence may contain special characters that people interpret as the end of a line, e.g. '\n'. Other characters, or even more than one character, are also possible.
Take the following text:
Hello1
World1
Hello2
World2
It may be stored in a file like this:
Hello1\nWorld1\nHello2\nWorld2\n
Only because we interpret '\n' as the end of a line can we "see" lines in there.
So, if you want to modify a line, you need to find the start position in the file of the thing that we interpret as a line, and then modify some bytes.
That can of course only be done if the length of the "line" does not change. Then you could use "seek" functions and overwrite the needed bytes.
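As a rough sketch of that seek-and-overwrite idea, assuming the replacement has exactly the same length as the bytes it replaces: the helper overwrite_in_place and the demo function are hypothetical names, and an in-memory std::stringstream stands in for the file.

```cpp
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: overwrites the bytes starting at byte offset `pos`.
// This is only safe when `replacement` is exactly as long as the text it
// replaces; otherwise the following "lines" would be damaged.
bool overwrite_in_place(std::iostream& file, std::streamoff pos,
                        const std::string& replacement)
{
    file.clear();                    // drop any eof/fail state from earlier reads
    file.seekp(pos, std::ios::beg);  // move the write position
    file.write(replacement.data(),
               static_cast<std::streamsize>(replacement.size()));
    return static_cast<bool>(file);
}

// Demonstration on the four-line text used above.
std::string demo_overwrite()
{
    std::stringstream file("Hello1\nWorld1\nHello2\nWorld2\n");
    // "Hello2" starts at byte offset 14: two 7-byte lines come before it.
    overwrite_in_place(file, 14, "HELLO2");
    return file.str();
}
```

With a real file, one would presumably open a std::fstream with std::ios::in | std::ios::out (and std::ios::binary, so that newline translation does not shift the offsets) and pass it to the same helper.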
In practice, hardly anybody does that. Normally, you read the "lines" of the file into some kind of memory buffer, do the modification there, and then write all lines back.
For example, you define a std::vector, read all lines with std::getline, and push_back each line into the std::vector.
The modifications are made in the std::vector, and then all the data is written back to the file, overwriting all the "old" data.
There are more answers to this question. If you have a more specific question, I will answer again.
Some simple example code
#include <iostream>
#include <fstream>
#include <vector>
#include <string>
int main() {
// Here we will store all lines of the text file
std::vector<std::string> lines{};
// Open the text file for reading and check, if it could be opened
if (std::ifstream textfileStream{ "test.txt" }; textfileStream) {
// Read all lines into our vector
std::string oneLine{};
while (std::getline(textfileStream, oneLine)) {
// Add the just read line to our vector
lines.push_back(oneLine);
}
// For test purposes, modify the first line
if (not lines.empty()) lines[0] = "MODIFIED";
}
else std::cerr << "\nError: Could not open input text file\n";
// Write back data
// Open the text file for writing and check, if it could be opened
if (std::ofstream textfileStream{ "test.txt" }; textfileStream) {
// Iterate over all lines and write to file
for (const std::string& oneLine : lines)
textfileStream << oneLine << '\n';
}
else std::cerr << "\nError: Could not open output text file\n";
return 0;
}
#include <iostream>
#include <fstream>
#include <stdio.h>
#include <string>
#include <sstream>
using namespace std;
void stringGen(char num){
int count=0;
int a;
string line,check,linex;
string msg="Message_Handler:";
fstream ifile;
ifile.open("sample.txt",ios::in|ios::out);
if(ifile){
while(getline (ifile,line)) {
if (line.find("Message_Handler:") == 0){
check=line.substr(16,5);
count++;
a=ifile.tellp();
}
}
ifile.close();
if(count==0){
int num=61001;
cout<<num<<endl;
num=num+1;
ofstream examplefile ("sample.txt",ios::app);
examplefile<<"Message_Handler:"<<num;
examplefile.close();
}
if(count==1){
cout<<check<<endl;
stringstream geek(check);
int num=0;
geek>>num;
if(num<61004){
num=num+1;
stringstream ss;
ss << num;
string nums = ss.str();
fstream MyFile("sample.txt",ios::in|ios::out);
MyFile.seekp(a-5);
MyFile<<nums;
}
else{
int num=61001;
stringstream ss;
ss << num;
string nums = ss.str();
fstream MyFile("sample.txt",ios::in|ios::out);
MyFile.seekp(a-5);
MyFile<<nums;
}
}
}
else{
int num=61001;
cout<<num<<endl;
ofstream MyFile("sample.txt",ios::app);
MyFile <<"Message_Handler:"<< num+1;
MyFile.close();
}
}
int main (){
char num;
stringGen(num);
return 0;
}
/* In the text file I have a char followed by a blank space, then a string. I'm trying to read the char and the string into separate arrays. Any help is appreciated */
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
using namespace std;
int main()
{
char arrivOrDepart;
string licensePlt;
ifstream inFile;
inFile.open("Text.txt");
if (!inFile)
{
cout << "Can't open file" << endl;
return 1;
}
for (int i = 0; i < 4; i++)
{
getline(cin, arrivOrDepart[i]);
getline(cin, licensePlt[i]);
}
inFile.close();
cin.get();
return 0;
}
//text file
A QWE123
A ASD123
A ZXC123
A WER123
A SDF123
#include <fstream>
#include <iterator>
#include <string>
#include <vector>
This reads whitespace-separated tokens from the file into a vector:
std::ifstream input("d:\\testinput.txt");
// istream_iterator extracts with operator>>, so each element is one token
std::vector<std::string> tokens(
(std::istream_iterator<std::string>(input)),
(std::istream_iterator<std::string>()));
input.close();
Then just put the data into whatever container you want. You should almost always prefer a vector over an array, by the way.
There are a few problems with the code:
getline is the wrong tool for this. If you want to split a stream on whitespace, use >>.
arrivOrDepart and licensePlt are not defined as arrays but are used as arrays.
The code reads from cin, not from the file.
My suggested fixes (excluding using vectors instead of arrays):
#include <iostream>
#include <fstream>
#include <string>
#include <cstdlib>
using namespace std; // avoid using this
int main()
{
const int MAXARRAY = 4; // avoid using magic numbers
char arrivOrDepart[MAXARRAY]; // made an array, but prefer std::vector
string licensePlt[MAXARRAY]; //made an array
ifstream inFile;
inFile.open("Text.txt");
if (!inFile)
{
cout << "Can't open file" << endl;
return 1;
}
string temp;
int i = 0;
while (i < MAXARRAY && // not overrunning the arrays
inFile >> temp >> licensePlt[i] && // read data from file stream
temp.length() == 1) // read only one character for arrivOrDepart
{
arrivOrDepart[i] = temp[0];
i++;
}
inFile.close();
cin.get();
return 0;
}
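The vector-based variant that the answer sets aside could look roughly like this. It is only a sketch: the ParkingEvent struct and the function names are hypothetical, and the stream is passed in so the same code works for an ifstream or an in-memory string.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record type: one line of Text.txt, e.g. "A QWE123".
struct ParkingEvent
{
    char arrivOrDepart;
    std::string licensePlt;
};

// Reads "char string" pairs until the input ends or a malformed pair appears.
std::vector<ParkingEvent> read_events(std::istream& in)
{
    std::vector<ParkingEvent> events;
    std::string kind;
    std::string plate;
    // operator>> skips whitespace, so it splits on the blank and on the newline.
    while (in >> kind >> plate && kind.length() == 1)
        events.push_back(ParkingEvent{kind[0], plate});
    return events;
}

// Convenience wrapper for trying the reader on a string.
std::vector<ParkingEvent> read_events_from_text(const std::string& text)
{
    std::istringstream in(text);
    return read_events(in);
}
```

Because the vector grows as needed, there is no MAXARRAY limit to guard against, and all five lines of the sample file would be read.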
Recommended reading:
Why is "using namespace std" considered bad practice?
What is a magic number, and why is it bad?
std::vector documentation (Alternate easier to read but often less accurate documentation)
std::getline documentation. Note the third parameter used to set the parsing delimiter.
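To illustrate the note about the third parameter of std::getline, here is a small sketch that splits a string on an arbitrary delimiter character. The helper name split_on is hypothetical:

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` at every occurrence of `delimiter`.
std::vector<std::string> split_on(const std::string& text, char delimiter)
{
    std::vector<std::string> parts;
    std::istringstream in(text);
    std::string part;
    // The third argument replaces the default '\n' as the stop character.
    while (std::getline(in, part, delimiter))
        parts.push_back(part);
    return parts;
}
```

Unlike >>, this keeps empty fields between two adjacent delimiters, which may or may not be what you want.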
I need to parse a table of numbers formatted as ASCII text. There are 36 space-delimited signed integers per line of text and about 3000 lines in the file. The input file is generated by me in Matlab, so I could modify the format. On the other hand, I also want to be able to parse the same file in VHDL, so ASCII text is about the only format possible.
So far, I have a little program like this that can loop through all the lines of the input file. I just haven't found a way to get individual numbers out of the line. I am not a C++ purist. I would consider fscanf(), but 36 numbers is a bit much for that. Please suggest practical ways to get numbers out of a text file.
int main()
{
string line;
ifstream myfile("CorrOut.dat");
if (!myfile.is_open())
cout << "Unable to open file";
else{
while (getline(myfile, line))
{
cout << line << '\n';
}
myfile.close();
}
return 0;
}
Use std::istringstream. Here is an example:
#include <sstream>
#include <string>
#include <fstream>
#include <iostream>
using namespace std;
int main()
{
string line;
int num;
ifstream ifs("YourData");
while (getline(ifs, line))
{
istringstream strm(line);
while ( strm >> num )
cout << num << " ";
cout << "\n";
}
}
If you want to create a table, use a std::vector or other suitable container:
#include <sstream>
#include <string>
#include <fstream>
#include <iostream>
#include <vector>
using namespace std;
int main()
{
string line;
// our 2 dimensional table
vector<vector<int>> table;
int num;
ifstream ifs("YourData");
while (getline(ifs, line))
{
vector<int> vInt;
istringstream strm(line);
while ( strm >> num )
vInt.push_back(num);
table.push_back(vInt);
}
}
The table vector gets populated, row by row. Note we created an intermediate vector to store each row, and then that row gets added to the table.
You can use a few different approaches. The one offered above is probably the quickest of them; however, in case you have different delimiter characters, you may consider one of the following solutions:
The first solution reads strings line by line. It then uses the find function to locate the first position of the specific delimiter, removes the number just read, and continues until the delimiter is no longer found.
You can customize the delimiter by modifying the delimiter variable value.
#include <iostream>
#include <string>
#include <fstream>
#include <vector>
#include <cstdlib> // for atoi
using namespace std;
int main()
{
string line;
ifstream myfile("CorrOut.dat");
string delimiter = " ";
size_t pos = 0;
string token;
vector<vector<int>> data;
if (!myfile.is_open())
cout << "Unable to open file";
else {
while (getline(myfile, line))
{
vector<int> temp;
pos = 0;
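while ((pos = line.find(delimiter)) != std::string::npos) {
token = line.substr(0, pos);
std::cout << token << std::endl;
line.erase(0, pos + delimiter.length());
temp.push_back(atoi(token.c_str()));
}
// whatever remains after the last delimiter is the final number on the line
if (!line.empty())
temp.push_back(atoi(line.c_str()));
data.push_back(temp);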
}
myfile.close();
}
return 0;
}
The second solution makes use of regex and does not care which delimiter is used; it searches for and matches any integers found in the string.
#include <iostream>
#include <string>
#include <regex> // The new library introduced in C++ 11
#include <fstream>
#include <vector>
#include <cstdlib> // for atoi
using namespace std;
int main()
{
string line;
ifstream myfile("CorrOut.dat");
std::smatch m;
std::regex e("[-+]?\\d+");
vector<vector<int>> data;
if (!myfile.is_open())
cout << "Unable to open file";
else {
while (getline(myfile, line))
{
vector<int> temp;
while (regex_search(line, m, e)) {
for (auto x : m) {
std::cout << x.str() << " ";
temp.push_back(atoi(x.str().c_str()));
}
std::cout << std::endl;
line = m.suffix().str();
}
data.push_back(temp);
}
myfile.close();
}
return 0;
}
I would like to read an input file in C++, for which the structure (or lack of) would be something like a series of lines with text = number, such as
input1 = 10
input2 = 4
set1 = 1.2
set2 = 1.e3
I want to get the number out of the line and throw the rest away. Numbers can be either integers or doubles, but I know when they are one or the other.
I also would like to read it such as
input1 = 10
input2=4
set1 =1.2
set2= 1.e3
so as to be more robust to the user. I think this means that it shouldn't be read in a formatted fashion.
Anyway, is there a smart way to do that?
I have already tried the following, but with minimal knowledge of what I've been doing, so the result was as expected... no success.
#include <stdio.h>
#include <stdlib.h>
#include <float.h>
#include <math.h>
#include <iostream>
#include <fstream>
#include <iomanip>
#include <cstdlib>
#include <boost/lexical_cast.hpp>
#include <string>
using namespace std;
using namespace boost;
int main(){
string tmp;
char temp[100];
int i,j,k;
ifstream InFile("input.dat");
//strtol
InFile.getline(temp,100);
k=strtol(temp,0,10);
cout << k << endl;
//lexical_cast
InFile.getline(temp,100);
j = lexical_cast<int>(temp);
cout << j << endl;
//Direct read
InFile >> tmp >> i;
cout << i << endl;
return 0;
}
Simply read one line at a time.
Then split each line on the '=' sign. Use the stream functionality to do the rest.
#include <sstream>
#include <fstream>
#include <iostream>
#include <string>
int main()
{
std::ifstream data("input.dat");
std::string line;
while(std::getline(data,line))
{
std::stringstream str(line);
std::string text;
std::getline(str,text,'=');
double value;
str >> value;
}
}
With error checking:
#include <sstream>
#include <fstream>
#include <iostream>
#include <string>
int main()
{
std::ifstream data("input.dat");
std::string line;
while(std::getline(data,line))
{
std::stringstream str(line);
std::string text;
double value;
if ((std::getline(str,text,'=')) && (str >> value))
{
// Happy Days..
// Do processing.
continue; // To start next iteration of loop.
}
// If we get here. An error occurred.
// By doing nothing the line will be ignored.
// Maybe just log an error.
}
}
There are already some fine solutions here. However, just to throw it out there, some comments implied that Boost Spirit is an inappropriate solution for this problem. I'm not sure I completely disagree. However, the following solution is very terse, readable (if you know EBNF) and error-tolerant. I'd consider using it.
#include <fstream>
#include <string>
#include <vector>
#include <boost/spirit.hpp>
using namespace std;
using namespace boost::spirit;
int main()
{
ifstream data("input.dat");
string line;
vector<double> numbers;
while(getline(data,line))
{
parse(line.c_str(),
*(+~ch_p('=') >> ch_p('=') >> real_p[push_back_a(numbers)]),
space_p);
}
}
Off the top of my head:
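#include <fstream>
#include <istream>
#include <sstream>
#include <string>
#include <vector>
using namespace std;
vector<double> vals(istream &in) {
vector<double> r;
string line;
while (getline(in, line)) {
const size_t eq = line.find('=');
if (eq != string::npos) {
// parse whatever follows the '=' as a double
istringstream ss(line.substr(eq + 1));
double d = 0;
ss >> d;
if (ss) r.push_back(d);
else throw "Line contains no value";
}
else {
throw "Line contains no =";
}
}
return r;
}
int main(int argc, char *argv[]) {
if (argc < 2) return 1; // expects the input file name as the first argument
ifstream file(argv[1]); // a named stream, because a temporary cannot bind to istream&
vector<double> vs = vals(file);
}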
C FTW (modified to handle doubles)
#include <stdio.h>
int
main ()
{
double num;
while (!feof (stdin))
if (1 == fscanf (stdin, "%*[^=] = %lf", &num))
printf ("%g\n", num);
return 0;
}
Now that you are already using Boost with lexical_cast, just parse each line with boost::split() and boost::is_any_of() into a 2-element vector, with token_compress turned on.
The following code illustrates the parse but skips the numeric conversion, which could be solved easily with boost::lexical_cast.
#include <fstream>
#include <sstream>
#include <string>
#include <iostream>
#include <vector>
#include <boost/algorithm/string/split.hpp>
#include <boost/algorithm/string/classification.hpp>
#include <boost/foreach.hpp>
using std::string;
using std::cout;
using std::ifstream;
using std::stringstream;
using std::vector;
std::string file_to_string()
{
ifstream data("data.txt");
stringstream s;
s << data.rdbuf();
return s.str();
}
void print_parameter(vector<string>& v_para)
{
cout << v_para[0];
cout << "=";
cout << v_para[1];
cout << std::endl;
}
vector<string> string_to_lines(const string& s)
{
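vector<string> v_lines;
boost::split(v_lines, s, boost::is_any_of("\n"), boost::token_compress_on);
return v_lines;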
}
int main()
{
vector<string> v_lines;
boost::split(v_lines, file_to_string(), boost::is_any_of("\n"), boost::token_compress_on);
vector<string> v_para;
BOOST_FOREACH(string& line, v_lines)
{
if(line.empty()) continue;
boost::split(v_para, line, boost::is_any_of(" ="), boost::token_compress_on);
// test it
print_parameter(v_para);
}
}
If you are devising this format, I would suggest adopting the INI file format.
The INI format has a lightweight syntax and includes sections, which give the file a little more structure; that may or may not be desirable in your case.
For example:
[section_1]
variable_1=value1
variable_2=999
[sectionA]
variable_A=value A
variable_B=111
The external links on this wikipedia page list a number of libraries for working with these types of files. They extend or replace the basic GetPrivateProfileString functions from the Windows API and support other platforms.
Most of these handle a space-padded = sign (or at least padding before the =, since a space after the = may be intentional or significant).
Some of these libraries might also have an option to omit [sections] if you don't want that (my own C++ class for handling INI-like format files has this option).
The advantage of these libraries, and of the Windows API GetPrivateProfileXXX functions, is that your program can access specific variables (i.e. get or set the value for variable_A from sectionA) without having to write, scan, or rewrite the entire file itself.
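To give a feel for the format, here is a minimal hand-rolled sketch of reading [section] and key=value lines into nested maps. It does not reproduce any of the libraries mentioned above: the names IniData, trim_blanks, parse_ini and parse_ini_text are hypothetical, and the sketch assumes that blanks around keys and values should simply be trimmed.

```cpp
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

// section name -> (key -> value); all names here are illustrative only.
using IniData = std::map<std::string, std::map<std::string, std::string>>;

// Removes leading and trailing spaces, tabs and carriage returns.
static std::string trim_blanks(const std::string& s)
{
    const char* blanks = " \t\r";
    const std::size_t b = s.find_first_not_of(blanks);
    if (b == std::string::npos) return "";
    const std::size_t e = s.find_last_not_of(blanks);
    return s.substr(b, e - b + 1);
}

IniData parse_ini(std::istream& in)
{
    IniData data;
    std::string section;  // keys before the first [section] go under ""
    std::string line;
    while (std::getline(in, line))
    {
        line = trim_blanks(line);
        if (line.empty()) continue;
        if (line.front() == '[' && line.back() == ']')
        {
            section = trim_blanks(line.substr(1, line.size() - 2));
            continue;
        }
        const std::size_t eq = line.find('=');
        if (eq == std::string::npos) continue;  // not a key=value line, ignore it
        // Assumption: padding on both sides of '=' is not significant.
        data[section][trim_blanks(line.substr(0, eq))] = trim_blanks(line.substr(eq + 1));
    }
    return data;
}

// Convenience wrapper for trying the parser on a string.
IniData parse_ini_text(const std::string& text)
{
    std::istringstream in(text);
    return parse_ini(in);
}
```

Values stay as strings here; converting them to numbers would be a separate step, for example with std::istringstream as in the other answers.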
Here's my quickest STL solution:
#include <fstream>
#include <list>
#include <locale>
void foo()
{
std::fstream f("c:\\temp\\foo.txt", std::ios_base::in);
std::list<double> numbers;
while (!f.eof())
{
int c = f.get();
if (!f) break; // get() reached the end of the file or failed
if (std::isdigit(static_cast<char>(c), std::locale::classic()) ||
c == '+' ||
c == '-' ||
c == '.')
{
f.putback(c);
double val;
f >> val;
if (f.fail()) {
f.clear(f.eof() ? std::ios_base::eofbit : std::ios_base::goodbit);
continue;
}
else
{
numbers.push_back(val);
}
}
}
}
Just tested this... it works, and doesn't require anything outside of the C++ standard library.
#include <iostream>
#include <map>
#include <string>
#include <algorithm>
#include <iterator>
#include <cctype>
#include <sstream>
using namespace std; // just because this is an example...
static void print(const pair<string, double> &p)
{
cout << p.first << " = " << p.second << "\n";
}
static double to_double(const string &s)
{
double value = 0;
istringstream is(s);
is >> value;
return value;
}
static string trim(const string &s)
{
size_t b = 0;
size_t e = s.size();
while (b < e && isspace(s[b])) ++b;
while (e > b && isspace(s[e-1])) --e;
return s.substr(b, e - b);
}
static void readINI(istream &is, map<string, double> &values)
{
string key;
string value;
while (getline(is, key, '='))
{
getline(is, value, '\n');
values.insert(make_pair(trim(key), to_double(value)));
}
}
int main()
{
map<string, double> values;
readINI(cin, values);
for_each(values.begin(), values.end(), print);
return 0;
}
EDIT: I just read the original question and noticed I'm not producing an exact answer. If you don't care about the key names, just discard them. Also, why do you need to identify the difference between integer values and floating-point values? Is 1000 an integer or a float? What about 1e3 or 1000.0? It's easy enough to check if a given floating-point value is integral, but there is a class of numbers that are both valid integers and valid floating-point values, and you need to get into your own parsing routines if you want to deal with that correctly.
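The "check if a given floating-point value is integral" step might be sketched as follows. The helper names are hypothetical, and the sketch deliberately treats 1000, 1e3 and 1000.0 alike, which is exactly the ambiguity described above:

```cpp
#include <cmath>
#include <sstream>
#include <string>

// Hypothetical helper: true if `value` is finite and has no fractional part.
bool is_integral(double value)
{
    return std::isfinite(value) && std::trunc(value) == value;
}

// Hypothetical helper: parses the start of `text` as a double and reports
// whether the parsed value is integral. It cannot tell how the number was written.
bool text_is_integral_number(const std::string& text)
{
    std::istringstream in(text);
    double value = 0;
    return (in >> value) && is_integral(value);
}
```

If the spelling of the number matters (for instance, rejecting 1e3 where an integer is expected), the text itself has to be inspected rather than the parsed value.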