C++ decoding LZ77-compressed data using std::fstream too slow

I have a function in my code which decodes a file compressed with the LZ77 algorithm, but on a 15 MB input file decompression takes about 3 minutes, which is far too slow. What is the reason for the poor performance? On every step of the loop I read two or three bytes and get the length, offset and next character. If the offset is not zero I also have to move "offset" bytes back in the output stream and read "length" bytes. Then I insert them at the end of the same stream before writing the next character there.
void uncompressData(long block_size, unsigned char* data, fstream &file_out)
{
    unsigned char* append;
    append = new unsigned char[buf_length];
    link myLink;
    long cur_position = 0;
    file_out.seekg(0, ios::beg);
    cout << file_out.tellg() << endl;
    int i = 0;
    myLink.length = -1;
    while (i < (block_size - 1))
    {
        if (myLink.length != -1) file_out << myLink.next;
        myLink.length = (short)(data[i] >> 4);
        //cout << myLink.length << endl;
        if (myLink.length != 0)
        {
            myLink.offset = (short)(data[i] & 0xF);
            myLink.offset = myLink.offset << 8;
            myLink.offset = myLink.offset | (short)data[i+1];
            myLink.next = (unsigned char)data[i+2];
            cur_position = file_out.tellg();
            file_out.seekg(-myLink.offset, ios_base::cur);
            if (myLink.length <= myLink.offset)
            {
                file_out.read((char*)append, myLink.length);
            }
            else
            {
                file_out.read((char*)append, myLink.offset);
                int k = myLink.offset, j = 0;
                while (k < myLink.length)
                {
                    append[k] = append[j];
                    j++;
                    if (j == myLink.offset) j = 0;
                    k++;
                }
            }
            file_out.seekg(cur_position);
            file_out.write((char*)append, myLink.length);
            i++;
        }
        else {
            myLink.offset = 0;
            myLink.next = (unsigned char)data[i+1];
        }
        i = i + 2;
    }
    unsigned char hasOddSymbol = data[block_size-1];
    if (hasOddSymbol == 0x0) { file_out << myLink.next; }
    delete[] append;
}

You could try doing it on a std::stringstream in memory instead, so that all the seeking, reading back and writing happens in RAM rather than going through the file on every step:
#include <sstream>

void uncompressData(long block_size, unsigned char* data, fstream& out)
{
    std::stringstream file_out;   // first line in the function

    // the rest of your function goes here

    out << file_out.rdbuf();      // last line in the function
}
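Going a step further, here is a sketch only, under the assumption that the whole decompressed output fits in memory and that the token layout matches the question's code (high nibble of data[i] is the length, the remaining 12 bits are the offset, data[i+2] is the next literal, and a zero length means "literal only"). It avoids stream seeking altogether by building the output in a std::vector and writing it once at the end; back-references then become plain index arithmetic:

#include <fstream>
#include <vector>

void uncompressData(long block_size, const unsigned char* data, std::fstream& file_out)
{
    std::vector<unsigned char> out;
    out.reserve(block_size * 4);                 // rough guess to limit reallocations

    unsigned char next = 0;
    bool havePending = false;                    // mirrors the original's "length != -1" check
    long i = 0;
    while (i < block_size - 1)
    {
        if (havePending)
            out.push_back(next);
        unsigned length = data[i] >> 4;
        if (length != 0)
        {
            unsigned offset = ((data[i] & 0xF) << 8) | data[i + 1];
            next = data[i + 2];
            auto start = out.size() - offset;    // back-reference start
            for (unsigned k = 0; k < length; ++k)
                out.push_back(out[start + k]);   // byte-by-byte copy handles overlapping matches
            i += 3;
        }
        else
        {
            next = data[i + 1];
            i += 2;
        }
        havePending = true;
    }
    if (data[block_size - 1] == 0x0)             // same trailing-literal rule as the original
        out.push_back(next);

    file_out.write(reinterpret_cast<const char*>(out.data()),
                   static_cast<std::streamsize>(out.size()));
}

The byte-by-byte copy is deliberate: it reproduces the original loop's handling of overlapping matches, where the length is larger than the offset.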

Related

Improving code performance by loading binary data instead of text and converting

Hi, I am working with existing C++ code. I normally use VB.NET, and much of what I am seeing is confusing and contradictory to me.
The existing code loads neural network weights from a file that is encoded as follows:
2
model.0.conv.conv.weight 5 3e17c000 3e9be000 3e844000 bc2f8000 3d676000
model.0.conv.bn.weight 7 4006a000 3f664000 3fc98000 3fa6a000 3ff2e000 3f5dc000 3fc94000
The first line gives the number of subsequent lines. Each of these lines has a description, a number representing how many values follow, then the weight values in hex. In the real file there are hundreds of rows and each row might have hundreds of thousands of weights. The weight file is 400MB in size. The values are converted to floats for use in the NN.
It takes over 3 minutes to decode this file. I am hoping to improve performance by eliminating the conversion from hex encoding to binary and just storing the values natively as floats. The problem is I can't understand what the code is doing, nor how I should be storing the values in binary. The relevant section that decodes the rows is here:
while (count--)
{
    Weights wt{ DataType::kFLOAT, nullptr, 0 };
    uint32_t size;
    // Read name and type of blob
    std::string name;
    input >> name >> std::dec >> size;
    wt.type = DataType::kFLOAT;
    // Load blob
    uint32_t* val = reinterpret_cast<uint32_t*>(malloc(sizeof(val) * size));
    for (uint32_t x = 0, y = size; x < y; ++x)
    {
        input >> std::hex >> val[x];
    }
    wt.values = val;
    wt.count = size;
    weightMap[name] = wt;
}
The Weights class is described here. DataType::kFLOAT is a 32-bit float.
I was hoping to add a line (or lines) in the inner loop below input >> std::hex >> val[x]; so that I could write the float values to a binary file as the values are converted from hex, but I don't understand what is going on. It looks like memory is being allocated to hold the values, but sizeof(val) is 8 bytes while a uint32_t is 4 bytes. Furthermore, it looks like the values are being stored in wt.values from val, but val contains integers, not floats. I really don't see what the intent is here.
Could I please get some advice on how to store and load binary values to eliminate the hex conversion? Any advice would be appreciated. A lot.
Here's an example program that will convert the text format shown into a binary format and back again. I took the data from the question and converted it to binary and back successfully. My feeling is that it's better to cook the data with a separate program before consuming it with the actual application, so the app's reading code stays single-purpose.
There's also an example of how to read the binary file into the Weights class at the end. I don't use TensorRT so I copied the two classes used from the documentation so the example compiles. Make sure you don't add those to your actual code.
If you have any questions let me know. Hope this helps and makes loading faster.
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <fstream>
#include <iostream>
#include <string>
#include <unordered_map>
#include <vector>
void usage()
{
std::cerr << "Usage: convert <operation> <input file> <output file>\n";
std::cerr << "\tconvert b in.txt out.bin - Convert text to binary\n";
std::cerr << "\tconvert t in.bin out.txt - Convert binary to text\n";
}
bool text_to_binary(const char *infilename, const char *outfilename)
{
std::ifstream in(infilename);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
std::ofstream out(outfilename, std::ios::binary);
if (!out)
{
std::cerr << "Error: Could not open output file '" << outfilename << "'\n";
return false;
}
uint32_t line_count;
if (!(in >> line_count))
{
return false;
}
if (!out.write(reinterpret_cast<const char *>(&line_count), sizeof(line_count)))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
std::string name;
uint32_t num_values;
if (!(in >> name >> std::dec >> num_values))
{
return false;
}
std::vector<uint32_t> values(num_values);
for (uint32_t i = 0; i < num_values; ++i)
{
if (!(in >> std::hex >> values[i]))
{
return false;
}
}
uint32_t name_size = static_cast<uint32_t>(name.size());
bool result = out.write(reinterpret_cast<const char *>(&name_size), sizeof(name_size)) &&
out.write(name.data(), name.size()) &&
out.write(reinterpret_cast<const char *>(&num_values), sizeof(num_values)) &&
out.write(reinterpret_cast<const char *>(values.data()), values.size() * sizeof(values[0]));
if (!result)
{
return false;
}
}
return true;
}
bool binary_to_text(const char *infilename, const char *outfilename)
{
std::ifstream in(infilename, std::ios::binary);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
std::ofstream out(outfilename);
if (!out)
{
std::cerr << "Error: Could not open output file '" << outfilename << "'\n";
return false;
}
uint32_t line_count;
if (!in.read(reinterpret_cast<char *>(&line_count), sizeof(line_count)))
{
return false;
}
if (!(out << line_count << "\n"))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
uint32_t name_size;
if (!in.read(reinterpret_cast<char *>(&name_size), sizeof(name_size)))
{
return false;
}
std::string name(name_size, 0);
if (!in.read(name.data(), name_size))
{
return false;
}
uint32_t num_values;
if (!in.read(reinterpret_cast<char *>(&num_values), sizeof(num_values)))
{
return false;
}
std::vector<float> values(num_values);
if (!in.read(reinterpret_cast<char *>(values.data()), num_values * sizeof(values[0])))
{
return false;
}
if (!(out << name << " " << std::dec << num_values))
{
return false;
}
for (float &f : values)
{
uint32_t i;
memcpy(&i, &f, sizeof(i));
if (!(out << " " << std::hex << i))
{
return false;
}
}
if (!(out << "\n"))
{
return false;
}
}
return true;
}
int main(int argc, const char *argv[])
{
if (argc != 4)
{
usage();
return EXIT_FAILURE;
}
char op = argv[1][0];
bool result = false;
switch (op)
{
case 'b':
case 'B':
result = text_to_binary(argv[2], argv[3]);
break;
case 't':
case 'T':
result = binary_to_text(argv[2], argv[3]);
break;
default:
usage();
break;
}
return result ? EXIT_SUCCESS : EXIT_FAILURE;
}
// Possible implementation of the code snippet in the original question to read the weights
// START Copied from TensorRT documentation - Do not include in your code
enum class DataType : int32_t
{
kFLOAT = 0,
kHALF = 1,
kINT8 = 2,
kINT32 = 3,
kBOOL = 4
};
class Weights
{
public:
DataType type;
const void *values;
int64_t count;
};
// END Copied from TensorRT documentation - Do not include in your code
bool read_weights(const char *infilename)
{
std::unordered_map<std::string, Weights> weightMap;
std::ifstream in(infilename, std::ios::binary);
if (!in)
{
std::cerr << "Error: Could not open input file '" << infilename << "'\n";
return false;
}
uint32_t line_count;
if (!in.read(reinterpret_cast<char *>(&line_count), sizeof(line_count)))
{
return false;
}
for (uint32_t l = 0; l < line_count; ++l)
{
uint32_t name_size;
if (!in.read(reinterpret_cast<char *>(&name_size), sizeof(name_size)))
{
return false;
}
std::string name(name_size, 0);
if (!in.read(name.data(), name_size))
{
return false;
}
uint32_t num_values;
if (!in.read(reinterpret_cast<char *>(&num_values), sizeof(num_values)))
{
return false;
}
// Normally I would use float* values = new float[num_values]; here which
// requires delete [] ptr; to free the memory later.
// I used malloc to match the original example since I don't know who is
// responsible to clean things up later, and TensorRT might use free(ptr)
// Makes no real difference as long as new/delete or malloc/free are matched up.
float *values = reinterpret_cast<float *>(malloc(num_values * sizeof(*values)));
if (!in.read(reinterpret_cast<char *>(values), num_values * sizeof(*values)))
{
return false;
}
weightMap[name] = Weights { DataType::kFLOAT, values, num_values };
}
return true;
}
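A side note on the "val contains integers not floats" confusion: each hex token in the text file is the raw IEEE-754 bit pattern of a 32-bit float, so parsing it into a uint32_t and later treating that same storage as float recovers the original value. Here is a minimal sketch of that round trip for a single token:

#include <cstdint>
#include <cstring>
#include <iostream>
#include <sstream>

int main()
{
    std::istringstream token("3e17c000");        // one hex token from the weight file
    uint32_t bits = 0;
    token >> std::hex >> bits;

    float value = 0.0f;
    std::memcpy(&value, &bits, sizeof(value));   // reinterpret the same 32 bits as a float
    std::cout << value << "\n";                  // prints approximately 0.148193
}

The memcpy in binary_to_text above performs the reverse of this step when converting binary back to the hex text format.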

Read binary file and count specific number c++

Hey everyone, I've been looking everywhere for insight on how to do this particular assignment. I saw something similar, but it didn't have a clear explanation. I'm trying to read a .bin file and count the number of times a specific number appears. I saw examples of this using a .txt file, and it seemed very straightforward using getline. I tried to replicate a similar structure but with a binary file.
int main() {
    int searching = 3;
    int counter = 0;
    unsigned char * memblock;
    long long int size;
    //open bin file
    ifstream file;
    file.open("threesData.bin", ios::in | ios::binary | ios::ate);
    //read bin file
    if (file.is_open()) {
        cout << "it opened\n";
        size = file.tellg();
        memblock = new unsigned char[size];
        file.seekg(0, ios::beg);
        file.read((char *) memblock, size);
        while (file.read((char *) memblock, size)) {
            for (int i = 0; i < size; i++) {
                (int) memblock[i];
                if (memblock[i] == searching) {
                    counter++;
                }
            }
        }
    }
    file.close();
    cout << "The number " << searching << " appears ";
    cout << counter << " times!";
    return 0;
}
When I run the program it's clear that it opens but it doesn't count the number I'm searching for. What am I doing wrong?
You seem to be thinking this through, but here's how I would go about doing it. (Note that in your version the first file.read already consumes the whole file, so the read in the while loop's condition fails immediately and the counting loop body never runs.)
Initialize a buffer with a sensible size.
Cast it to an int pointer, so you can use array[index] syntax for simpler arithmetic.
Open the stream, and read while the stream is valid.
Convert the number of bytes read to the number of ints you would expect.
Increment the counter for each value you find that matches.
Code
#include <fstream>
#include <iostream>

bool check_character(int value)
{
    return value == 3;
}

int main(void)
{
    // choose the size, cast a pointer as an int type, and initialize
    // our counter
    static constexpr size_t size = 4096;
    char* buffer = new char[size];
    int* ints = (int*) buffer;
    size_t counter = 0;

    // create our stream,
    std::ifstream stream("file.bin", std::ios_base::binary);
    while (stream) {
        // keep reading while the stream is valid
        stream.read(buffer, size);
        auto count = stream.gcount();
        // we only want to go to the last valid integer
        // if we expect the file to be only integers,
        // we could do `assert(count % sizeof(int) == 0);`
        // otherwise, we may have trailing characters
        // if we have trailing characters, we may want to move them
        // to the front of the buffer....
        auto chars = count / sizeof(int); // floor division
        for (size_t i = 0; i < chars; ++i) {
            // false == 0, true == 1, so we can just add
            // if the value is 3
            counter += check_character(ints[i]);
        }
    }
    std::cout << "Counter is: " << counter << std::endl;
    delete[] buffer;
    return 0;
}
As NeilButterworth points out, you could also use a vector. I don't really like this, but "meh".
#include <fstream>
#include <iostream>
#include <vector>

/* ellipsed lines */

int main(void)
{
    /* ellipsed lines */
    static constexpr size_t size = 4096;
    std::vector<int> ints;
    ints.resize(size / sizeof(int));
    char* buffer = (char*) ints.data();
    /* ellipsed lines */

    /* ellipsed lines */
    std::cout << "Counter is: " << counter << std::endl;
    // no delete[]
    return 0;
}
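If the file comfortably fits in memory, a simpler variation (just a sketch, and it assumes the file size is an exact multiple of sizeof(int)) is to read everything in one go and let std::count do the work:

#include <algorithm>
#include <cstddef>
#include <fstream>
#include <iostream>
#include <vector>

int main()
{
    std::ifstream stream("file.bin", std::ios::binary | std::ios::ate);
    if (!stream)
        return 1;

    auto bytes = static_cast<std::size_t>(stream.tellg());   // file size in bytes
    std::vector<int> values(bytes / sizeof(int));

    stream.seekg(0);
    stream.read(reinterpret_cast<char*>(values.data()),
                static_cast<std::streamsize>(values.size() * sizeof(int)));

    std::cout << "Counter is: "
              << std::count(values.begin(), values.end(), 3) << std::endl;
    return 0;
}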

c++ writing and reading objects to binary files

I'm trying to read an array object (Array is a class I've made) using read and write functions to read and write from binary files. So far the write function works, but it won't read from the file properly for some reason. This is the write function:
void writeToBinFile(const char* path) const
{
    ofstream ofs(path, ios_base::out | ios_base::app | ios_base::binary);
    if (ofs.is_open())
    {
        ostringstream oss;
        for (unsigned int i = 0; i < m_size; i++)
        {
            oss << ' ';
            oss << m_data[i];
        }
        ofs.write(oss.str().c_str(), oss.str().size());
    }
}
This is the read function:
void readFromBinFile(const char* path)
{
    ifstream ifs(path, ios_base::in | ios_base::binary || ios_base::ate);
    if (ifs.is_open())
    {
        stringstream ss;
        int charCount = 0, spaceCount = 0;
        ifs.unget();
        while (spaceCount != m_size)
        {
            charCount++;
            if (ifs.peek() == ' ')
            {
                spaceCount++;
            }
            ifs.unget();
        }
        ifs.get();
        char* ch = new char[sizeof(char) * charCount];
        ifs.read(ch, sizeof(char) * charCount);
        ss << ch;
        delete[] ch;
        for (unsigned int i = 0; i < m_size; i++)
        {
            ss >> m_data[i];
            m_elementCount++;
        }
    }
}
These are the class fields:
T* m_data;
unsigned int m_size;
unsigned int m_elementCount;
I'm using the following code to write and then read (one execution for writing, another for reading):
Array<int> arr3(5);
//arr3[0] = 38;
//arr3[1] = 22;
//arr3[2] = 55;
//arr3[3] = 7;
//arr3[4] = 94;
//arr3.writeToBinFile("binfile.bin");
arr3.readFromBinFile("binfile.bin");
for (unsigned int i = 0; i < arr3.elementCount(); i++)
{
cout << "arr3[" << i << "] = " << arr3[i] << endl;
}
The problem is in the readFromBinFile function: it gets stuck in an infinite loop, peek() returns -1 for some reason, and I can't figure out why.
Also note that I'm writing to the binary file with a space between each element so I can differentiate between objects in the array, plus a space at the start of the write to separate the array data from any binary data previously stored in the file.
The major problem, in my mind, is that you write fixed-size binary data in variable-size textual form. It would be so much simpler if you just stuck to pure binary form.
Instead of writing to a string stream and then writing that output to the actual file, just write the binary data directly to the file:
ofs.write(reinterpret_cast<char*>(m_data), sizeof(m_data[0]) * m_size);
Then do something similar when reading the data.
For this to work, you of course need to save the number of entries in the array/vector first before writing the actual data.
So the actual write function could be as simple as
void writeToBinFile(const char* path) const
{
    ofstream ofs(path, ios_base::out | ios_base::binary);
    if (ofs)
    {
        ofs.write(reinterpret_cast<const char*>(&m_size), sizeof(m_size));
        ofs.write(reinterpret_cast<const char*>(&m_data[0]), sizeof(m_data[0]) * m_size);
    }
}
And the read function
void readFromBinFile(const char* path)
{
    ifstream ifs(path, ios_base::in | ios_base::binary);
    if (ifs)
    {
        // Read the size
        ifs.read(reinterpret_cast<char*>(&m_size), sizeof(m_size));

        // Read all the data
        ifs.read(reinterpret_cast<char*>(&m_data[0]), sizeof(m_data[0]) * m_size);
    }
}
Depending on how you define m_data you might need to allocate memory for it before reading the actual data.
Oh, and if you want to append data at the end of the array (though why would you? in the current code you show, you rewrite the whole array anyway), you update the size stored at the beginning, seek to the end, and then write the new data.
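A rough sketch of that append step (a hypothetical free function, assuming the layout written by writeToBinFile above: the element count first, then the raw data):

#include <fstream>

template <typename T>
void appendToBinFile(const char* path, const T* extra, unsigned int count)
{
    std::fstream fs(path, std::ios::in | std::ios::out | std::ios::binary);
    if (!fs)
        return;

    unsigned int size = 0;
    fs.read(reinterpret_cast<char*>(&size), sizeof(size));   // existing element count
    size += count;

    fs.seekp(0);                                             // update the count in place...
    fs.write(reinterpret_cast<const char*>(&size), sizeof(size));
    fs.seekp(0, std::ios::end);                              // ...then append the new elements
    fs.write(reinterpret_cast<const char*>(extra), sizeof(T) * count);
}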

String written to a file is not equal to the string read back from it

Phase 1
example 1: I have string text = "01100001", and I want to write "a" to the file
example 2: I have string text = "0110000101100010", so I want to write "ab" to the file
NOTE: I solved phase 1 and the bytes written to the file are correct.
Phase 2
for example 1:
I want to read the file and put it into temp,
so temp = "a" and I convert it to "01100001".
for example 2:
I want to read the file and put it into temp,
so temp = "ab" and I convert it to "0110000101100010".
Question
In my code I have the input below:
string text ="00000110101011100010001011111110011011110101100101110101101111010111111110101011"
"00111011000011100011100000100010111110111110111001100001110001110000101001111010"
"00000101";
I did "phase 1" and I opened the file in a hex editor the writing is true.
But after doing "phase 2" temp != text. Why?
My code
#include <iostream>
#include <sstream>
#include <vector>
#include <fstream>
#include <string>
#include <stdlib.h>
using namespace std;
class bitChar{
public:
unsigned char* c;
int shift_count;
string BITS;
bitChar()
{
shift_count = 0;
c = (unsigned char*)calloc(1, sizeof(char));
}
string readByBits(ifstream& inf)
{
string s ="";
while (inf)
{
string strInput;
getline(inf, strInput );
for (int i =0 ; i < strInput.size() ; i++)
{
s += getBits(strInput[i]);
}
}
return s;
}
void setBITS(string X)
{
BITS = X;
}
int insertBits(ofstream& outf)
{
int total = 0 ;
while(BITS.length())
{
if(BITS[0] == '1')
*c |= 1;
*c <<= 1;
++shift_count;
++total;
BITS.erase(0, 1);
if(shift_count == 7 )
{
if(BITS.size()>0)
{
if(BITS[0] == '1')
*c |= 1;
++total;
BITS.erase(0, 1);
}
writeBits(outf);
shift_count = 0;
free(c);
c = (unsigned char*)calloc(1, sizeof(char));
}
}
if(shift_count > 0)
{
*c <<= (7 - shift_count);
writeBits(outf);
free(c);
c = (unsigned char*)calloc(1, sizeof(char));
}
outf.close();
return total;
}
string getBits(unsigned char X)
{
stringstream itoa;
for(unsigned s = 7; s > 0 ; s--)
{
itoa << ((X >> s) & 1);
}
itoa << (X&1) ;
return itoa.str();
}
void writeBits(ofstream& outf)
{
outf << *c;
}
~bitChar()
{
if(c)
free(c);
}
};
int main()
{
ofstream outf("ssSample.dat",ios::binary);
string text ="00000110101011100010001011111110011011110101100101110101101111010111111110101011"
"00111011000011100011100000100010111110111110111001100001110001110000101001111010"
"00000101";
cout<< text<<endl;
//write to file
bitChar bchar;
bchar.setBITS(text);
bchar.insertBits(outf);
outf.close();
ifstream inf("ssSample.dat" ,ios::binary);
//READ FROM FILE
string temp=bchar.readByBits(inf);
cout << endl;
cout << temp << endl;
return 0;
}
You have an LF (Line Feed) character; this is the character that is getting omitted:
0000 1010
This may be unrelated, but Windows requires a CR and LF for a new line. This code may act differently in Windows vs. Unix.
Read one byte at a time.
string readByBits(ifstream& inf)
{
    string s = "";
    char buffer[1];
    while (inf.read(buffer, 1))
    {
        // string strInput;
        // getline(inf, strInput);
        // for (int i = 0; i < strInput.size(); i++)
        // {
        s += getBits(*buffer);
        // }
    }
    return s;
}
Program output:
000001101010111000100010111111100110111101011001011101011011110101111111101010110011101100001110001110000010001011111011111011100110000111000111000010100111101000000101
000001101010111000100010111111100110111101011001011101011011110101111111101010110011101100001110001110000010001011111011111011100110000111000111000010100111101000000101
One problem with your approach is that your text must be a multiple of 8 bits to work. Otherwise, even if everything is correct, the last character will be read from the file and converted to 8 binary digits in the string, adding trailing zeros.
Two problems I quickly identified (but I assume there are more):
Your input is not a multiple of 8 bits.
By using getline you're reading until you meet a delimiting character, which spoils your result since you're not dealing with a text-based file.
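For completeness, here is a minimal sketch (assuming the whole file fits in memory) that reads every byte, including 0x0A, without getline, and expands each byte back to its bit string:

#include <fstream>
#include <iterator>
#include <string>

std::string readAllBits(const char* path)
{
    std::ifstream inf(path, std::ios::binary);
    std::string bytes((std::istreambuf_iterator<char>(inf)),
                      std::istreambuf_iterator<char>());    // every byte, newlines included
    std::string s;
    for (unsigned char b : bytes)
        for (int i = 7; i >= 0; --i)
            s += ((b >> i) & 1) ? '1' : '0';
    return s;
}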

How to get the line number from a file in C++?

What would be the best way to get the line number of the current line in a file that I have opened with an ifstream? I am reading in the data and I need to store the line number that it is on so that I can display it later if the data doesn't match the specifications.
If you don't want to limit yourself to std::getline, then you could use a class derived from std::streambuf which keeps track of the current line number:
class CountingStreamBuffer : public std::streambuf { /* see below */ };

// open file
std::ifstream file("somefile.txt");

// "pipe" through counting stream buffer
CountingStreamBuffer cntstreambuf(file.rdbuf());
std::istream is(&cntstreambuf);

// sample usage
is >> x >> y >> z;
cout << "At line " << cntstreambuf.lineNumber();

std::getline(is, str);
cout << "At line " << cntstreambuf.lineNumber();
Here is a sample implementation of CountingStreamBuffer:
#include <streambuf>
class CountingStreamBuffer : public std::streambuf
{
public:
// constructor
CountingStreamBuffer(std::streambuf* sbuf) :
streamBuf_(sbuf),
lineNumber_(1),
lastLineNumber_(1),
column_(0),
prevColumn_(static_cast<unsigned int>(-1)),
filePos_(0)
{
}
// Get current line number
unsigned int lineNumber() const { return lineNumber_; }
// Get line number of previously read character
unsigned int prevLineNumber() const { return lastLineNumber_; }
// Get current column
unsigned int column() const { return column_; }
// Get file position
std::streamsize filepos() const { return filePos_; }
protected:
CountingStreamBuffer(const CountingStreamBuffer&);
CountingStreamBuffer& operator=(const CountingStreamBuffer&);
// extract next character from stream w/o advancing read pos
std::streambuf::int_type underflow()
{
return streamBuf_->sgetc();
}
// extract next character from stream
std::streambuf::int_type uflow()
{
int_type rc = streamBuf_->sbumpc();
lastLineNumber_ = lineNumber_;
if (traits_type::eq_int_type(rc, traits_type::to_int_type('\n')))
{
++lineNumber_;
prevColumn_ = column_ + 1;
column_ = static_cast<unsigned int>(-1);
}
++column_;
++filePos_;
return rc;
}
// put back last character
std::streambuf::int_type pbackfail(std::streambuf::int_type c)
{
if (traits_type::eq_int_type(c, traits_type::to_int_type('\n')))
{
--lineNumber_;
lastLineNumber_ = lineNumber_;
column_ = prevColumn_;
prevColumn_ = 0;
}
--column_;
--filePos_;
if (c != traits_type::eof())
return streamBuf_->sputbackc(traits_type::to_char_type(c));
else
return streamBuf_->sungetc();
}
// change position by offset, according to way and mode
virtual std::ios::pos_type seekoff(std::ios::off_type pos,
std::ios_base::seekdir dir,
std::ios_base::openmode mode)
{
if (dir == std::ios_base::beg
&& pos == static_cast<std::ios::off_type>(0))
{
lastLineNumber_ = 1;
lineNumber_ = 1;
column_ = 0;
prevColumn_ = static_cast<unsigned int>(-1);
filePos_ = 0;
return streamBuf_->pubseekoff(pos, dir, mode);
}
else
return std::streambuf::seekoff(pos, dir, mode);
}
// change to specified position, according to mode
virtual std::ios::pos_type seekpos(std::ios::pos_type pos,
std::ios_base::openmode mode)
{
if (pos == static_cast<std::ios::pos_type>(0))
{
lastLineNumber_ = 1;
lineNumber_ = 1;
column_ = 0;
prevColumn_ = static_cast<unsigned int>(-1);
filePos_ = 0;
return streamBuf_->pubseekpos(pos, mode);
}
else
return std::streambuf::seekpos(pos, mode);
}
private:
std::streambuf* streamBuf_; // hosted streambuffer
unsigned int lineNumber_; // current line number
unsigned int lastLineNumber_;// line number of last read character
unsigned int column_; // current column
unsigned int prevColumn_; // previous column
std::streamsize filePos_; // file position
};
From an ifstream point of view there is no line number. If you read in the file line by line, then you just have to keep track of it yourself.
Use std::getline to read each line in one at a time. Keep an integer indicating the number of lines you have read: initialize it to zero, and each time you call std::getline and it succeeds, increment it.
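A minimal sketch of that approach (the file name is just a placeholder):

#include <fstream>
#include <string>

int main()
{
    std::ifstream file("somefile.txt");
    std::string line;
    int lineNumber = 0;
    while (std::getline(file, line))
    {
        ++lineNumber;   // 1-based number of the line just read
        // ... parse `line`; report lineNumber if it doesn't match the specification
    }
}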
An inefficient but dead simple way is to have a function that, given a stream, counts the newline characters from the beginning of the stream to the current position.
int getCurrentLine(std::istream& is)
{
    int lineCount = 1;
    is.clear();    // need to clear error bits otherwise tellg returns -1.
    auto originalPos = is.tellg();
    if (originalPos < 0)
        return -1;
    is.seekg(0);
    char c;
    while ((is.tellg() < originalPos) && is.get(c))
    {
        if (c == '\n') ++lineCount;
    }
    return lineCount;
}
In some code I am working on, I am only interested in knowing the line number if invalid input is encountered, in which case the import is aborted immediately. Since the function is called only once, the inefficiency is not really a problem.
The following is a full example:
#include <iostream>
#include <sstream>

int getCurrentLine(std::istream& is)
{
    int lineCount = 1;
    is.clear();    // need to clear error bits otherwise tellg returns -1.
    auto originalPos = is.tellg();
    if (originalPos < 0)
        return -1;
    is.seekg(0);
    char c;
    while ((is.tellg() < originalPos) && is.get(c))
    {
        if (c == '\n') ++lineCount;
    }
    return lineCount;
}

void ReadDataFromStream(std::istream& s)
{
    double x, y, z;
    while (!s.fail() && !s.eof())
    {
        s >> x >> y >> z;
        if (!s.fail())
            std::cout << x << "," << y << "," << z << "\n";
    }
    if (s.fail())
        std::cout << "Error at line: " << getCurrentLine(s) << "\n";
    else
        std::cout << "Read until line: " << getCurrentLine(s) << "\n";
}

int main(int argc, char* argv[])
{
    std::stringstream s;
    s << "0.0 0.0 0.0\n";
    s << "1.0 ??? 0.0\n";
    s << "0.0 1.0 0.0\n";
    ReadDataFromStream(s);

    std::stringstream s2;
    s2 << "0.0 0.0 0.0\n";
    s2 << "1.0 0.0 0.0\n";
    s2 << "0.0 1.0 0.0";
    ReadDataFromStream(s2);

    return 0;
}