How to get the line number from a file in C++?

What would be the best way to get the line number of the current line in a file that I have opened with an ifstream? I am reading in the data, and I need to store the line number it is on so that I can display it later if the data doesn't match the specifications.

If you don't want to limit yourself to std::getline, then you could use a class derived from std::streambuf which keeps track of the current line number:
class CountingStreamBuffer : public std::streambuf { /* see below */ };
// open file
std::ifstream file("somefile.txt");
// "pipe" through counting stream buffer
CountingStreamBuffer cntstreambuf(file.rdbuf());
std::istream is(&cntstreambuf);
// sample usage (x, y, z and str are placeholder variables for this example)
double x, y, z;
std::string str;
is >> x >> y >> z;
std::cout << "At line " << cntstreambuf.lineNumber() << "\n";
std::getline(is, str);
std::cout << "At line " << cntstreambuf.lineNumber() << "\n";
Here is a sample implementation of CountingStreamBuffer:
#include <streambuf>
class CountingStreamBuffer : public std::streambuf
{
public:
// constructor
CountingStreamBuffer(std::streambuf* sbuf) :
streamBuf_(sbuf),
lineNumber_(1),
lastLineNumber_(1),
column_(0),
prevColumn_(static_cast<unsigned int>(-1)),
filePos_(0)
{
}
// Get current line number
unsigned int lineNumber() const { return lineNumber_; }
// Get line number of previously read character
unsigned int prevLineNumber() const { return lastLineNumber_; }
// Get current column
unsigned int column() const { return column_; }
// Get file position
std::streamsize filepos() const { return filePos_; }
protected:
CountingStreamBuffer(const CountingStreamBuffer&);
CountingStreamBuffer& operator=(const CountingStreamBuffer&);
// extract next character from stream w/o advancing read pos
std::streambuf::int_type underflow()
{
return streamBuf_->sgetc();
}
// extract next character from stream
std::streambuf::int_type uflow()
{
int_type rc = streamBuf_->sbumpc();
if (traits_type::eq_int_type(rc, traits_type::eof()))
return rc; // don't advance the counters past end-of-stream
lastLineNumber_ = lineNumber_;
if (traits_type::eq_int_type(rc, traits_type::to_int_type('\n')))
{
++lineNumber_;
prevColumn_ = column_ + 1;
column_ = static_cast<unsigned int>(-1);
}
++column_;
++filePos_;
return rc;
}
// put back last character
std::streambuf::int_type pbackfail(std::streambuf::int_type c)
{
if (traits_type::eq_int_type(c, traits_type::to_int_type('\n')))
{
--lineNumber_;
lastLineNumber_ = lineNumber_;
column_ = prevColumn_;
prevColumn_ = 0;
}
--column_;
--filePos_;
if (c != traits_type::eof())
return streamBuf_->sputbackc(traits_type::to_char_type(c));
else
return streamBuf_->sungetc();
}
// change position by offset, according to way and mode
virtual std::ios::pos_type seekoff(std::ios::off_type pos,
std::ios_base::seekdir dir,
std::ios_base::openmode mode)
{
if (dir == std::ios_base::beg
&& pos == static_cast<std::ios::off_type>(0))
{
lastLineNumber_ = 1;
lineNumber_ = 1;
column_ = 0;
prevColumn_ = static_cast<unsigned int>(-1);
filePos_ = 0;
return streamBuf_->pubseekoff(pos, dir, mode);
}
else
return std::streambuf::seekoff(pos, dir, mode);
}
// change to specified position, according to mode
virtual std::ios::pos_type seekpos(std::ios::pos_type pos,
std::ios_base::openmode mode)
{
if (pos == static_cast<std::ios::pos_type>(0))
{
lastLineNumber_ = 1;
lineNumber_ = 1;
column_ = 0;
prevColumn_ = static_cast<unsigned int>(-1);
filePos_ = 0;
return streamBuf_->pubseekpos(pos, mode);
}
else
return std::streambuf::seekpos(pos, mode);
}
private:
std::streambuf* streamBuf_; // hosted streambuffer
unsigned int lineNumber_; // current line number
unsigned int lastLineNumber_;// line number of last read character
unsigned int column_; // current column
unsigned int prevColumn_; // previous column
std::streamsize filePos_; // file position
};

From an ifstream point of view there is no line number. If you read in the file line by line, then you just have to keep track of it yourself.

Use std::getline to read each line in one at a time. Keep an integer counting the number of lines you have read: initialize it to zero, and each time you call std::getline and it succeeds, increment it.
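A minimal sketch of that approach (the file name and the expected three-number line format are just placeholders for this example):

#include <fstream>
#include <iostream>
#include <sstream>
#include <string>

int main()
{
    std::ifstream file("somefile.txt");
    std::string line;
    int lineNumber = 0;
    while (std::getline(file, line))
    {
        ++lineNumber;                    // count every successfully read line
        std::istringstream fields(line); // parse the line separately from reading it
        double x, y, z;
        if (!(fields >> x >> y >> z))
            std::cerr << "Data doesn't match the specification at line " << lineNumber << "\n";
    }
    return 0;
}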

An inefficient but dead-simple way is to have a function that, given a stream, counts the newline characters from the beginning of the stream up to the current position.
int getCurrentLine(std::istream& is)
{
int lineCount = 1;
is.clear(); // need to clear error bits otherwise tellg returns -1.
auto originalPos = is.tellg();
if (originalPos < 0)
return -1;
is.seekg(0);
char c;
while ((is.tellg() < originalPos) && is.get(c))
{
if (c == '\n') ++lineCount;
}
return lineCount;
}
In some code I am working on, I am only interested in knowing the line number if invalid input is encountered, in which case the import is aborted immediately. Since the function is called only once, the inefficiency is not really a problem.
The following is a full example:
#include <iostream>
#include <sstream>
int getCurrentLine(std::istream& is)
{
int lineCount = 1;
is.clear(); // need to clear error bits otherwise tellg returns -1.
auto originalPos = is.tellg();
if (originalPos < 0)
return -1;
is.seekg(0);
char c;
while ((is.tellg() < originalPos) && is.get(c))
{
if (c == '\n') ++lineCount;
}
return lineCount;
}
void ReadDataFromStream(std::istream& s)
{
double x, y, z;
while (!s.fail() && !s.eof())
{
s >> x >> y >> z;
if (!s.fail())
std::cout << x << "," << y << "," << z << "\n";
}
if (s.fail())
std::cout << "Error at line: " << getCurrentLine(s) << "\n";
else
std::cout << "Read until line: " << getCurrentLine(s) << "\n";
}
int main(int argc, char* argv[])
{
std::stringstream s;
s << "0.0 0.0 0.0\n";
s << "1.0 ??? 0.0\n";
s << "0.0 1.0 0.0\n";
ReadDataFromStream(s);
std::stringstream s2;
s2 << "0.0 0.0 0.0\n";
s2 << "1.0 0.0 0.0\n";
s2 << "0.0 1.0 0.0";
ReadDataFromStream(s2);
return 0;
}

Related

Improving code performance by loading binary data instead of text and converting

Hi, I am working with existing C++ code. I normally use VB.NET, and much of what I am seeing is confusing and contradictory to me.
The existing code loads neural network weights from a file that is encoded as follows:
2
model.0.conv.conv.weight 5 3e17c000 3e9be000 3e844000 bc2f8000 3d676000
model.0.conv.bn.weight 7 4006a000 3f664000 3fc98000 3fa6a000 3ff2e000 3f5dc000 3fc94000
The first line gives the number of subsequent lines. Each of these lines has a description, a number representing how many values follow, then the weight values in hex. In the real file there are hundreds of rows and each row might have hundreds of thousands of weights. The weight file is 400MB in size. The values are converted to floats for use in the NN.
It takes over 3 minutes to decode this file. I am hoping to improve performance by eliminating the conversion from hex encoding to binary and just storing the values natively as floats. The problem is I can't understand what the code is doing, nor how I should be storing the values in binary. The relevant section that decodes the rows is here:
while (count--)
{
Weights wt{ DataType::kFLOAT, nullptr, 0 };
uint32_t size;
// Read name and type of blob
std::string name;
input >> name >> std::dec >> size;
wt.type = DataType::kFLOAT;
// Load blob
uint32_t* val = reinterpret_cast<uint32_t*>(malloc(sizeof(val) * size));
for (uint32_t x = 0, y = size; x < y; ++x)
{
input >> std::hex >> val[x];
}
wt.values = val;
wt.count = size;
weightMap[name] = wt;
}
The Weights class is described here. DataType::kFLOAT is a 32-bit float.
I was hoping to add a line (or lines) in the inner loop below input >> std::hex >> val[x]; so that I could write the float values to a binary file as the values are converted from hex, but I don't understand what is going on. It looks like memory is being allocated to hold the values, but sizeof(val) is 8 bytes and a uint32_t is 4 bytes. Furthermore, it looks like the values are being stored in wt.values from val, but val contains integers, not floats. I really don't see what the intent is here.
Could I please get some advice on how to store and load binary values to eliminate the hex conversion? Any advice would be appreciated. A lot.
Here's an example program that will convert the text format shown into a binary format and back again. I took the data from the question, converted it to binary and back, and it round-tripped successfully. My feeling is that it's better to cook the data with a separate program before consuming it with the actual application, so the application's reading code stays single-purpose.
There's also an example of how to read the binary file into the Weights class at the end. I don't use TensorRT, so I copied the two classes it uses from the documentation so that the example compiles. Make sure you don't add those to your actual code.
If you have any questions let me know. Hope this helps and makes loading faster.
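As an aside, the part that confused you is that each hex word in the text file is just the bit pattern of an IEEE-754 32-bit float; that is why the original code can read the values into uint32_t storage and still hand them to the Weights struct as kFLOAT. A minimal sketch (the variable names are mine, not from your code) showing the conversion for a single value:

#include <cstdint>
#include <cstring>
#include <iostream>
#include <sstream>

int main()
{
    std::istringstream in("3e17c000");           // first weight from the sample data
    std::uint32_t bits = 0;
    in >> std::hex >> bits;                      // parse the hex word into a 32-bit integer
    float weight;
    std::memcpy(&weight, &bits, sizeof(weight)); // reinterpret those bits as a float
    std::cout << weight << "\n";                 // prints approximately 0.148193
    return 0;
}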
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
void usage()
{
std::cerr << "Usage: convert <operation> <input file> <output file>\n";
std::cerr << "\tconvert b in.txt out.bin - Convert text to binary\n";
std::cerr << "\tconvert t in.bin out.txt - Convert binary to text\n";
}
bool text_to_binary(const char *infilename, const char *outfilename)
{
std::ifstream in(infilename);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
std::ofstream out(outfilename, std::ios::binary);
if (!out)
{
std::cerr << "Error: Could not open output file '" << outfilename << "'\n";
return false;
}
uint32_t line_count;
if (!(in >> line_count))
{
return false;
}
if (!out.write(reinterpret_cast<const char *>(&line_count), sizeof(line_count)))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
std::string name;
uint32_t num_values;
if (!(in >> name >> std::dec >> num_values))
{
return false;
}
std::vector<uint32_t> values(num_values);
for (uint32_t i = 0; i < num_values; ++i)
{
if (!(in >> std::hex >> values[i]))
{
return false;
}
}
uint32_t name_size = static_cast<uint32_t>(name.size());
bool result = out.write(reinterpret_cast<const char *>(&name_size), sizeof(name_size)) &&
out.write(name.data(), name.size()) &&
out.write(reinterpret_cast<const char *>(&num_values), sizeof(num_values)) &&
out.write(reinterpret_cast<const char *>(values.data()), values.size() * sizeof(values[0]));
if (!result)
{
return false;
}
}
return true;
}
bool binary_to_text(const char *infilename, const char *outfilename)
{
std::ifstream in(infilename, std::ios::binary);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
std::ofstream out(outfilename);
if (!out)
{
std::cerr << "Error: Could not open output file '" << outfilename << "'\n";
return false;
}
uint32_t line_count;
if (!in.read(reinterpret_cast<char *>(&line_count), sizeof(line_count)))
{
return false;
}
if (!(out << line_count << "\n"))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
uint32_t name_size;
if (!in.read(reinterpret_cast<char *>(&name_size), sizeof(name_size)))
{
return false;
}
std::string name(name_size, 0);
if (!in.read(name.data(), name_size))
{
return false;
}
uint32_t num_values;
if (!in.read(reinterpret_cast<char *>(&num_values), sizeof(num_values)))
{
return false;
}
std::vector<float> values(num_values);
if (!in.read(reinterpret_cast<char *>(values.data()), num_values * sizeof(values[0])))
{
return false;
}
if (!(out << name << " " << std::dec << num_values))
{
return false;
}
for (float &f : values)
{
uint32_t i;
memcpy(&i, &f, sizeof(i));
if (!(out << " " << std::hex << i))
{
return false;
}
}
if (!(out << "\n"))
{
return false;
}
}
return true;
}
int main(int argc, const char *argv[])
{
if (argc != 4)
{
usage();
return EXIT_FAILURE;
}
char op = argv[1][0];
bool result = false;
switch (op)
{
case 'b':
case 'B':
result = text_to_binary(argv[2], argv[3]);
break;
case 't':
case 'T':
result = binary_to_text(argv[2], argv[3]);
break;
default:
usage();
break;
}
return result ? EXIT_SUCCESS : EXIT_FAILURE;
}
// Possible implementation of the code snippet in the original question to read the weights
// START Copied from TensorRT documentation - Do not include in your code
enum class DataType : int32_t
{
kFLOAT = 0,
kHALF = 1,
kINT8 = 2,
kINT32 = 3,
kBOOL = 4
};
class Weights
{
public:
DataType type;
const void *values;
int64_t count;
};
// END Copied from TensorRT documentation - Do not include in your code
bool read_weights(const char *infilename)
{
std::unordered_map<std::string, Weights> weightMap;
std::ifstream in(infilename, std::ios::binary);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
uint32_t line_count;
if (!in.read(reinterpret_cast<char *>(&line_count), sizeof(line_count)))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
uint32_t name_size;
if (!in.read(reinterpret_cast<char *>(&name_size), sizeof(name_size)))
{
return false;
}
std::string name(name_size, 0);
if (!in.read(name.data(), name_size))
{
return false;
}
uint32_t num_values;
if (!in.read(reinterpret_cast<char *>(&num_values), sizeof(num_values)))
{
return false;
}
// Normally I would use float* values = new float[num_values]; here which
// requires delete [] ptr; to free the memory later.
// I used malloc to match the original example since I don't know who is
// responsible to clean things up later, and TensorRT might use free(ptr)
// It makes no real difference as long as new/delete or malloc/free are matched up.
float *values = reinterpret_cast<float *>(malloc(num_values * sizeof(*values)));
if (!in.read(reinterpret_cast<char *>(values), num_values * sizeof(*values)))
{
return false;
}
weightMap[name] = Weights { DataType::kFLOAT, values, num_values };
}
return true;
}

C++ decoding LZ77-compressed data using std::fstream too slow

I have a function in my code which decodes a file compressed using the LZ77 algorithm, but on a 15 MB input file decompression takes about 3 minutes (too slow). What's the reason for the poor performance? On every step of the loop I read two or three bytes and get the length, offset and next character. If the offset is not zero, I also have to move "offset" bytes back in the output stream and read "length" bytes. Then I insert them at the end of the same stream before writing the next character there.
void uncompressData(long block_size, unsigned char* data, fstream &file_out)
{
unsigned char* append;
append = new unsigned char[buf_length];
link myLink;
long cur_position = 0;
file_out.seekg(0, ios::beg);
cout << file_out.tellg() << endl;
int i=0;
myLink.length=-1;
while(i<(block_size-1))
{
if(myLink.length!=-1) file_out << myLink.next;
myLink.length = (short)(data[i] >> 4);
//cout << myLink.length << endl;
if(myLink.length!=0)
{
myLink.offset = (short)(data[i] & 0xF);
myLink.offset = myLink.offset << 8;
myLink.offset = myLink.offset | (short)data[i+1];
myLink.next = (unsigned char)data[i+2];
cur_position=file_out.tellg();
file_out.seekg(-myLink.offset,ios_base::cur);
if(myLink.length<=myLink.offset)
{
file_out.read((char*)append, myLink.length);
}
else
{
file_out.read((char*)append, myLink.offset);
int k=myLink.offset,j=0;
while(k<myLink.length)
{
append[k]=append[j];
j++;
if(j==myLink.offset) j=0;
k++;
}
}
file_out.seekg(cur_position);
file_out.write((char*)append, myLink.length);
i++;
}
else {
myLink.offset = 0;
myLink.next = (unsigned char)data[i+1];
}
i=i+2;
}
unsigned char hasOddSymbol = data[block_size-1];
if(hasOddSymbol==0x0) { file_out << myLink.next; }
delete[] append;
}
You could try doing it on a std::stringstream in memory instead; the seeks, reads and writes in the loop then operate on an in-memory buffer, and the result is written to the file in one go at the end:
#include <sstream>
void uncompressData(long block_size, unsigned char* data, fstream& out)
{
std::stringstream file_out; // first line in the function
// the rest of your function goes here
out << file_out.rdbuf(); // last line in the function
}

Why am i getting blank output after writing this filehandling code in c++?

I have made a tester class where I take questions from a question pool text file and put random questions from there into a docx file. I want to know why my code is giving me blank output in the docx file.
My random function is working fine. I am selecting two questions from each of the three question files.
Here is my code:
void test()
{
string line;
fstream question1("questiondesc.txt",ios::in | ios::out | ios::app);
fstream testgen("GeneratedTest.docx",ios::trunc | ios::in | ios::out);
testgen.open("GeneratedTest.docx");
if(!question1.is_open())
{
question1.open("questiondesc.txt");
}
int i,num;
for (i = 0; i < 2; i++) {
num = random(1,12);
for(int i =1;i<=num;i++)
{
getline(question1,line);
}
question1.clear();
question1.seekg(0, ios::beg);
testgen<<line<<endl;
}
question1.close();
ifstream question2("questionmcq.txt");
if(!question2.is_open())
{
question2.open("questionmcq.txt");
}
for (i = 0; i < 2; i++) {
num = random(1,26);
while(num%2==0)
{
num = random(1,26);
}
for(int i =1;i<=num;i++)
{
getline(question2,line);
}
testgen<<line<<endl;
getline(question2,line);
testgen<<line<<endl;
question2.clear();
question2.seekg(0, ios::beg);
}
question2.close();
ifstream question3("questionanalytical.txt");
if(!question3.is_open())
{
question3.open("questionanalytical.txt");
}
for (i = 0; i < 2; i++) {
num = random(1,12);
for(int i =1;i<=num;i++)
{
getline(question3,line);
}
question3.clear();
question3.seekg(0, ios::beg);
testgen<<line<<endl;
}
question3.close();
testgen.close();
}
There are errors in your code. I will show them as comments in the listing below. Additionally, I will show one (of many, and maybe not the best) solution for your problem.
You should break down your problem into smaller pieces and design more functions. Then, life will be easier.
Additionally, you should write comments. If you write comments, then you will detect the problems by yourself.
Your code with my remarks:
#include <iostream>
#include <fstream>
#include <string>
#include <random>
using namespace std; // NO NEVER USE
int random(int from, int to) {
std::random_device random_device;
std::mt19937 generator(random_device());
std::uniform_int_distribution<int> distribution(from, to);
return distribution(generator);
}
void test()
{
string line; // Line is not initialized an not needed here. Pollutes namespace
fstream question1("questiondesc.txt", ios::in | ios::out | ios::app); // Opening a file with these flags will fail. Use ifstream
fstream testgen("GeneratedTest.docx", ios::trunc | ios::in | ios::out);// Opening a file with these flags will fail. Use ofstream
testgen.open("GeneratedTest.docx"); // File was alread opened and failed. Reopening will not work. It failed alread
if (!question1.is_open()) // Use "if (!question1)" instead. There could also be other error bits
{ // Always check the status of any IO operation
question1.open("questiondesc.txt"); // Will never work. Failer already
}
int i, num; // Variables not initialized and not needed here. Namespace pollution
for (i = 0; i < 2; i++) {
num = random(1, 12); // This function was not defined. I redefined it
for (int i = 1; i <= num; i++) // i=1 and i<=num, really? Not i=0 and i<num?
{
getline(question1, line); // Always check status of any IO function
}
question1.clear();
question1.seekg(0, ios::beg);
testgen << line << endl;
}
question1.close(); // The destructor of the fstream will close the file for you
ifstream question2("questionmcq.txt"); // Now you open the file as ifstream
if (!question2.is_open()) // Do check for all possible flags.: If (!question2)
{
question2.open("questionmcq.txt"); // Will not work if it failed the first time
}
for (i = 0; i < 2; i++) { // So 2 times
num = random(1, 26);
while (num % 2 == 0) // While the number is even
{
num = random(1, 26); // Get an odd number
}
for (int i = 1; i <= num; i++) // Usually from 0 to <num
{
getline(question2, line);
}
testgen << line << endl;
getline(question2, line);
testgen << line << endl;
question2.clear();
question2.seekg(0, ios::beg);
}
question2.close(); // No need to close. Destructor will do it for you
ifstream question3("questionanalytical.txt"); // Now you open the file as ifstream
if (!question3.is_open()) // Wrong check. Check for all flags
{
question3.open("questionanalytical.txt"); // Will not help in case of failure
}
// Now this is the 3rd time with the same code. So, put it into a function
for (i = 0; i < 2; i++) {
num = random(1, 12);
for (int i = 1; i <= num; i++)
{
getline(question3, line);
}
question3.clear();
question3.seekg(0, ios::beg);
testgen << line << endl;
}
question3.close();
testgen.close();
}
int main() {
test();
return 0;
}
And here is one possible solution, with functions to handle similar parts of the code:
#include <iostream>
#include <string>
#include <fstream>
#include <random>
#include <vector>
#include <tuple>
// From the internet: https://en.cppreference.com/w/cpp/numeric/random/random_device
int random(int from, int to) {
std::random_device random_device;
std::mt19937 generator(random_device());
std::uniform_int_distribution<int> distribution(from, to);
return distribution(generator);
}
std::string readNthLineFromFile(std::ifstream& ifs, int n) {
// Reset file to the beginning
ifs.clear();
ifs.seekg(0, std::ios::beg);
// Default return string in case of error
std::string result{ "\n*** Error while reading a line from the source file\n" };
// If getline fails or ifs is in fail state, the string will be default
// Read lines until the n-th one (1-based) is in 'result'
for (; (n != 0) && std::getline(ifs, result); --n);
// Give back the desired line
return result;
}
void generateQuestion(std::ifstream& sourceFileStream, std::ofstream& destinationFileStream, int n, const bool twoLines = false) {
// We want to prevent reading the same question again
int oldLineNumber = 0;
// For whatever reason, do this 2 times.
for (size_t i = 0U; i < 2; ++i) {
// If we want to read 2 consecutive lines, then we should not end up with the last line in the file
if (twoLines && (n > 1)) --n;
// Get a random line number. But no duplicates in the 2 loops
int lineNumber{};
do {
lineNumber = random(1, n);
} while (lineNumber == oldLineNumber);
// For the next loop execution
oldLineNumber = lineNumber;
// Read the random line
std::string line{ readNthLineFromFile(sourceFileStream, lineNumber) };
// And write it to the destination file
destinationFileStream << line << "\n";
// If we want to read two lines in a row
if (twoLines) {
// Read next line
line = readNthLineFromFile(sourceFileStream, ++lineNumber);
// And write it to the destination file
destinationFileStream << line << "\n";
}
}
}
int main() {
const std::string destinationFilename{ "generatedTest.txt" };
const std::string questions1Filename{ "questiondesc.txt" };
const std::string questions2Filename{ "questionmcq.txt" };
const std::string questions3Filename{ "questionanalytical.txt" };
// Here we store the filenames and if one or 2 lines shall be read
std::vector<std::tuple<const std::string, const size_t, const bool>> source{
{ questions1Filename, 12U, false },
{ questions2Filename, 26U, true },
{ questions3Filename, 12U, false }
};
// Open the destination file and check, if it could be opened
if (std::ofstream destinationFileStream(destinationFilename); destinationFileStream) {
// Now open each source file and generate the questions
for (const std::tuple<const std::string, const size_t, const bool>& t : source) {
// Open source file and check, if it could be opened
if (std::ifstream sourceFileStream(std::get<0>(t)); sourceFileStream) {
generateQuestion(sourceFileStream, destinationFileStream, std::get<1>(t), std::get<2>(t));
}
else {
std::cerr << "\n*** Error. Could not open source file '" << std::get<0>(t) << "'\n";
}
}
}
else {
std::cerr << "\n*** Error: Could not open destination file '" << destinationFilename << "'\n";
}
return 0;
}

Analog scanf("%1d") for C++ (std::cin)

I'm looking for some analog of scanf("%1d", &sequence) for std::cin >> sequence.
For example:
for ( ; scanf("%1d", &sequence) == 1; ) {
printf("%d ", sequence);
}
stdin: 5341235
stdout: 5 3 4 1 2 3 5
How does it work in C++ ?!
for ( ; std::cin >> *some_magic* sequence; ) {
std::cout << sequence << " ";
}
You can do this if you want (the sequence variable must be of type char):
for ( ; std::cin.read(&sequence, 1); ) {
if (sequence < '0' || sequence > '9') continue; // skip newlines and other non-digit characters
std::cout << (sequence - '0') << " ";
}
With respect to input parsing there are a number of features unfortunately missing from IOStreams which are present for scanf(). Setting a field width for numeric types is one of them (another one is matching strings in inputs). Assuming you want to stay with formatted input, one way to deal with it is to create a filtering stream buffer which injects a space character after a given number of characters.
Another approach consists of writing a custom std::num_get<char> facet, imbue()-ing it into the stream, and then just setting the width. Instead of injecting spaces, the actual character parsing stops when either the end of the stream is reached or the number of allowed characters has been consumed. The corresponding code to use this facet would set up a custom std::locale but otherwise look like one would expect:
int main() {
std::istringstream in("1234567890123456789");
std::locale loc(std::locale(), new width_num_get);
in.imbue(loc);
int w(0);
for (int value(0); in >> std::setw(++w) >> value; ) {
std::cout << "value=" << value << "\n";
}
}
Here is a somewhat naive implementation of a corresponding std::num_get<char> facet which just collects the appropriate digits (assuming base 10) and then calls std::stol() to get the value converted. It could be made more flexible and more efficient, but you get the picture:
#include <iostream>
#include <streambuf>
#include <sstream>
#include <locale>
#include <string>
#include <iomanip>
#include <cctype>
struct width_num_get
: std::num_get<char> {
auto do_get(iter_type it, iter_type end, std::ios_base& fmt,
std::ios_base::iostate& err, long& value) const
-> iter_type override {
int width(fmt.width(0)), count(0);
if (width == 0) {
width = -1;
}
std::string digits;
if (it != end && (*it == '-' || *it == '+')) {
digits.push_back(*it++);
++count;
}
while (it != end && count != width && std::isdigit(static_cast<unsigned char>(*it))) {
digits.push_back(*it);
++it;
++count;
}
try { value = std::stol(digits); }
catch (...) { err |= std::ios_base::failbit; } // should probably distinguish overflow
return it;
}
};
The first described approach could use code like this for reading integers with increasing width (I'm using different widths to show that the width can be set flexibly):
int main() {
std::istringstream in("1234567890123456789");
int w(0);
for (int value(0); in >> fw(++w) >> value; ) {
std::cout << "value=" << value << "\n";
}
}
Of course, the entire magic is in the little fw(), which is a custom manipulator: it installs a filtering stream buffer if the currently used stream buffer isn't of the appropriate type, and it sets the number of characters after which a space should be injected. The filtering stream buffer reads individual characters and simply injects a space after the corresponding number of characters; a registered stream callback deletes the filtering buffer again when the stream goes away. The code could be something like this:
#include <iostream>
#include <streambuf>
#include <sstream>
class fieldbuf
: public std::streambuf {
std::streambuf* sbuf;
int width;
char buffer[1];
int underflow() {
if (this->width == 0) {
buffer[0] = ' ';
this->width = -1;
}
else {
int c = this->sbuf->sbumpc(); // consume exactly one character from the underlying buffer
if (c == std::char_traits<char>::eof()) {
return c;
}
buffer[0] = std::char_traits<char>::to_char_type(c);
if (0 < this->width) {
--this->width;
}
}
this->setg(buffer, buffer, buffer + 1);
return std::char_traits<char>::to_int_type(buffer[0]);
}
public:
fieldbuf(std::streambuf* sbuf): sbuf(sbuf), width(-1) {}
void setwidth(int width) { this->width = width; }
};
struct fw {
int width;
fw(int width): width(width) {}
};
std::istream& operator>> (std::istream& in, fw const& width) {
fieldbuf* fbuf(dynamic_cast<fieldbuf*>(in.rdbuf()));
if (!fbuf) {
fbuf = new fieldbuf(in.rdbuf());
in.rdbuf(fbuf);
static int index = std::ios_base::xalloc();
in.pword(index) = fbuf;
in.register_callback([](std::ios_base::event ev, std::ios_base& stream, int index){
if (ev == std::ios_base::copyfmt_event) {
stream.pword(index) = 0;
}
else if (ev == std::ios_base::erase_event) {
delete static_cast<fieldbuf*>(stream.pword(index));
stream.pword(index) = 0;
}
}, index);
}
fbuf->setwidth(width.width);
return in;
}

C++: get number of characters printed when using ofstream

The C fprintf() function returns the number of characters printed. Is there similar functionality in C++ when writing to a file with ofstream? I am interested in a solution that is compatible with C++03 if possible.
For example:
ofstream file("outputFile");
file << "hello";
// Here would I like to know that five characters were printed.
file << " world";
// Here I would like to know that six characters were printed.
What you're looking for is tellp().
You could use it like so:
ofstream file("outputFile");
auto pos1 = file.tellp();
file << "hello";
auto pos2 = file.tellp();
std::cout << pos2 - pos1 << std::endl;
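If you need to stay within C++03, as mentioned in the question, the same idea works without auto; a small self-contained sketch:

#include <fstream>
#include <iostream>

int main()
{
    std::ofstream file("outputFile");
    std::ofstream::pos_type pos1 = file.tellp();
    file << "hello";
    std::ofstream::pos_type pos2 = file.tellp();
    std::cout << pos2 - pos1 << std::endl;  // 5 characters were written
    return 0;
}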
Seek operations are rather expensive (primarily because they need to prepare streams to potentially switch between reading and writing). I'd personally rather use a filtering stream buffer which provides the counts, e.g.:
class countbuf: public std::streambuf {
    std::streambuf* sbuf;
    std::size_t count_;
    char buffer[256];
    int overflow(int c) {
        if (c != std::char_traits<char>::eof()) {
            // one slot is kept free in the put area so the overflowing character always fits
            *this->pptr() = std::char_traits<char>::to_char_type(c);
            this->pbump(1);
        }
        return this->sync() == -1
            ? std::char_traits<char>::eof()
            : std::char_traits<char>::not_eof(c);
    }
    int sync() {
        // count and forward the buffered characters before resetting the put area
        std::size_t size(this->pptr() - this->pbase());
        this->count_ += size;
        bool ok = static_cast<std::size_t>(this->sbuf->sputn(this->pbase(), size)) == size;
        this->setp(this->buffer, this->buffer + sizeof(this->buffer) - 1);
        return ok ? this->sbuf->pubsync() : -1;
    }
public:
    countbuf(std::streambuf* sbuf): sbuf(sbuf), count_() {
        this->setp(buffer, buffer + sizeof(buffer) - 1);
    }
    std::size_t count() const { return this->count_ + this->pptr() - this->pbase(); }
    std::size_t reset() {
        std::size_t rc(this->count());
        this->sync();
        this->count_ = 0;
        return rc;
    }
};
Once you got this stream buffer, you could just install into an std::ostream (and possibly package the construction into a custom stream class):
countbuf sbuf(std::cout.rdbuf()); // can't seek on this stream anyway...
std::ostream out(&sbuf);
out << "hello!\n" << std::flush;
std::cout << "count=" << out.reset() << '\n';