Possible reasons for tellg() failing? - c++

ifstream::tellg() is returning -13 for a certain file.
Basically, I wrote a utility that analyzes some source code. I open all files alphabetically; I start with "Apple.cpp" and it works perfectly. But when it gets to "Conversion.cpp", always on the same file, tellg() returns -13 after one line has been read successfully.
The code in question is:
for (int i = 0; i < files.size(); ++i) { /* For each .cpp and .h file */
TextIFile f(files[i]);
while (!f.AtEof()) // When it gets to conversion.cpp (not on the others)
// first is always successful, second always fails
lines.push_back(f.ReadLine());
The code for AtEof is:
bool AtEof() {
if (mFile.tellg() < 0)
FATAL(format("DEBUG - tellg(): %d") % mFile.tellg());
if (mFile.tellg() >= GetSize())
return true;
return false;
}
After it reads successfully the first line of Conversion.cpp, it always crashes with DEBUG - tellg(): -13.
This is the whole TextIFile class (written by me, so the error may be there):
class TextIFile
{
public:
TextIFile(const string& path) : mPath(path), mSize(0) {
mFile.open(path.c_str(), std::ios::in);
if (!mFile.is_open())
FATAL(format("Cannot open %s: %s") % path.c_str() % strerror(errno));
}
string GetPath() const { return mPath; }
size_t GetSize() {
if (mSize)
return mSize;
const size_t current_position = mFile.tellg();
mFile.seekg(0, std::ios::end);
mSize = mFile.tellg();
mFile.seekg(current_position);
return mSize;
}
bool AtEof() {
if (mFile.tellg() < 0)
FATAL(format("DEBUG - tellg(): %d") % mFile.tellg());
if (mFile.tellg() >= GetSize())
return true;
return false;
}
string ReadLine() {
string ret;
getline(mFile, ret);
CheckErrors();
return ret;
}
string ReadWhole() {
string ret((std::istreambuf_iterator<char>(mFile)), std::istreambuf_iterator<char>());
CheckErrors();
return ret;
}
private:
void CheckErrors() {
if (!mFile.good())
FATAL(format("An error has occured while performing an I/O operation on %s") % mPath);
}
const string mPath;
ifstream mFile;
size_t mSize;
};
Platform is Visual Studio, 32 bit, Windows.
Edit: Works on Linux.
Edit: I found the cause: line endings. Both Conversion and Guid and others had \n instead of \r\n. I saved them with \r\n instead and it worked. Still, this is not supposed to happen is it?

It's difficult to guess without knowing exactly what's in Conversion.cpp. However, using < with stream positions is not defined by the standard. You might want to consider an explicit cast to the correct integer type before formatting the position; I don't know what formatting FATAL and format() expect to perform or how the % operator is overloaded. Stream positions don't have to map in a predictable way to integers, certainly not if the file isn't opened in binary mode.
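The explicit conversion mentioned above could look roughly like the following. This is only a sketch: readOffset is a hypothetical helper name, and it works on any std::istream rather than on the TextIFile class from the question. It compares the position against pos_type(-1), which is what tellg() returns on failure, and only then converts it to an integer through std::streamoff.

```cpp
#include <ios>
#include <istream>
#include <sstream>

// Hypothetical helper: returns the current read offset as a plain integer,
// or -1 if tellg() reported failure.
long long readOffset(std::istream& in) {
    const std::istream::pos_type pos = in.tellg();
    if (pos == std::istream::pos_type(-1)) {
        return -1;  // tellg() failed, e.g. because failbit is already set
    }
    // Convert explicitly: pos_type -> streamoff -> integer.
    return static_cast<long long>(std::streamoff(pos));
}
```

The result can then be passed to a formatting function with a matching integer format specifier, instead of handing over the stream position object itself.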
You might want to consider an alternative implementation for AtEof(). Say something like:
bool AtEof()
{
return mFile.peek() == ifstream::traits_type::eof();
}
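Another option that avoids position queries altogether is to let std::getline drive the loop. The sketch below assumes a hypothetical free function readAllLines that takes any std::istream; it is not part of the question's class.

```cpp
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every line of `in` without calling tellg().
// The loop ends when getline fails, which happens at end of file or on error.
std::vector<std::string> readAllLines(std::istream& in) {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(in, line)) {
        lines.push_back(line);
    }
    return lines;
}
```

Because the loop condition is the read itself, a final line without a trailing newline is still returned, and no comparison between stream positions and file sizes is needed.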

Related

Can I safely use std::string to assemble binary data into messages?

I am using a std::string to hold binary data read from a socket.
The data consists of messages beginning with a '$' and ending with a '#'. Each message may contain '\0' characters.
I use std::string::find() to find the location of the first message and extract it from the string using std::string::substr():
class MessageSplitter {
public:
MessageSplitter() { m_data.reserve(1'000'000); }
void appendBinaryData(const std::string& binaryData) {
m_data.append(binaryData);
}
bool popMessage(std::string& msg) {
size_t beg_index = m_data.find("$");
if (beg_index == std::string::npos) {
return false;
}
size_t end_index = m_data.find("#", beg_index);
if (end_index == std::string::npos) {
return false;
}
const size_t end_len = 1; // length of the "#" terminator
size_t count = end_index - beg_index + end_len;
msg = m_data.substr(beg_index, count);
m_data = m_data.substr(end_index + end_len);
return true;
}
private:
std::string m_data;
};
I read from socket this way (error checking on recv omitted):
char buffer[4096];
int ret = ::recv(m_socket, buffer, 4096, 0);
std::string binaryData = std::string(buffer, ret);
This approach seems to work fine on Windows.
However is it guaranteed to work on other platforms according to the C++ standard?
This is perfectly safe from a language level. std::string is guaranteed to be able to handle non-printable characters including embedded nul characters just fine.
From a programmer's perspective, though, it's somewhat unsafe because it's surprising. When I see std::string I generally expect it to be printable text. It has an operator<<, for example, to make it easy to print to output streams, and I have to remember never to use that.
For that second reason, I would tend to prefer something more explicit, such as std::vector<std::byte> or std::vector<unsigned char>. Something that doesn't act like text is much more difficult to accidentally treat as text.
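To illustrate the first point, here is a small sketch showing that find() and substr() carry embedded '\0' bytes through unchanged. firstFrame is a hypothetical helper, not the MessageSplitter from the question, and the '$' and '#' delimiters are taken from the question's description.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the first "$...#" frame found in `data`,
// or an empty string if no complete frame is present.
std::string firstFrame(const std::string& data) {
    const std::size_t beg = data.find('$');
    if (beg == std::string::npos) {
        return std::string();
    }
    const std::size_t end = data.find('#', beg);
    if (end == std::string::npos) {
        return std::string();
    }
    // substr copies by length, so '\0' bytes inside the frame are preserved.
    return data.substr(beg, end - beg + 1);
}
```

The one place to be careful is construction: std::string(buffer, length) keeps every byte, whereas std::string(buffer) treats the buffer as a C string and stops at the first '\0'.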

fprintf fails on specific string while shell functions use it with no crash

The short version: I have C++ code that makes a C call to fprintf(stdout, some_cpp_str.c_str()) and crashes during it. The first 4 calls are fine; only the 5th crashes, and I have no idea why (I suspect an unreadable char inside the string). The first code I posted was mostly C, so I posted another version that is pure C++ except for the fprintf (added at the bottom of the question). The crashes occur consistently on an embedded device. On my own PC the code runs fine.
The long version:
I have code that reads lines from a text file and pushes them into a string vector. To check my progress, I also fprintf them to the screen after the vector is populated:
int main(){
char err_msg[256], * line = NULL, *in_file = "...", *keyword = "blah";
size_t len = 0;
ssize_t num_bytes_read;
int i = 1;
std::vector<std::string> lines_vector;
FILE * fp = fopen(in_file, "r");
if (!fp) {
fprintf(stdout,"can't open file %s for reading\n", in_file);
goto EXIT;
}
while ((num_bytes_read = getline(&line, &len, fp)) != -1) {
/* if found keyword inside line */
if (strstr(line, keyword)) {
/* add 3 lines (entry heading, entry body, newline)*/
lines_vector.push_back(std::string(line));
for(int lines_to_copy = 2; lines_to_copy > 0; lines_to_copy--) {
if((num_bytes_read = getline(&line, &len, fp)) == -1) {
fprintf(stdout,"can't read line from %s\n", in_file);
goto EXIT;
}
lines_vector.push_back(std::string(line));
}
}
}
fprintf(stdout,"finished reading from file\n");
EXIT:
fclose(fp);
free(line);
for (std::vector<std::string>::iterator it = lines_vector.begin() ; it != lines_vector.end(); ++it, ++i) {
fprintf(stdout, "%d)", i);
fprintf(stdout, "%s", (*it).c_str());
}
return 0;
}
This works fine on my VM, but I also run it on an embedded device, where it always crashes on a specific line. The line is:
certificates local generate name localcert common-name sf country(region) AB auto-regenerate-days 12 auto-regenerate-days-warning 11 e-mail X#Y.com locality(city) Z organization Q organization-unit T scep-password-string 57E6CA35452E72E4D1BC4518260ABFC7 scep-url http://0.0.0.0/X/Y/ state(province) s
I don't think there is a problem in the line itself (as it doesn't crash on my VM). When trying to print it to a file instead of to the screen, it doesn't crash:
for (std::vector<std::string>::iterator it = lines_vector.begin(); it != lines_vector.end(); ++it){
sprintf(tmp, "echo \"%s\" >> /X/Y/Z.txt", (*it).c_str());
OS_run(tmp); // run this command on sh shell
}
Since it crashes only on my embedded and not my VM, I thought the file is somehow corrupted. Could it be that the string has an invalid char inside that crashes fprintf, but not echo?
I tried translating this code into proper C++, but I still get a crash in the middle of the last string. I know mixing C/C++ is not good, but shouldn't c_str() be a proper interface between std::string and char * (which fprintf expects)?
If not this, then what could possibly crash during the fprintf?
int main()
{
std::vector<std::string> lines_vector;
std::ifstream infile(in_file);
std::string line;
int counter = 1;
while (std::getline(infile, line)) {
if (line.find(keyword, 0) != std::string::npos) {
lines_vector.push_back(line);
for(int lines_to_copy = 2; lines_to_copy > 0; lines_to_copy--) {
std::getline(infile, line);
lines_vector.push_back(line);
}
}
}
for (std::vector<std::string>::iterator it = lines_vector.begin(); it != lines_vector.end(); ++it){
fprintf(stdout, "%d)%s", counter++, (*it).c_str());
}
}
On an embedded device, you should expect that dynamic memory allocation can fail. That means you absolutely must check every possible allocation (you should do so on a non-embedded device too, but the risk of a crash there is much lower). You really should have:
while ((num_bytes_read = getline(&line, &len, fp)) != -1) {
...
}
if (line == NULL) {
perror("getline could not allocate buffer");
}
This will not fix anything, but at least you will know what happens.
I have kept your coding style here, with its heavy use of the C library and of goto, but I must advise you not to do that in C++ programs.
The C library was included in the C++ standard library because early C++ implementations lacked too much functionality. In modern C++, goto should be avoided, as should raw C strings and C I/O functions (except in very special use cases). C++ also comes with a version of getline (in header <string>) that directly fills a std::string. You really should try to avoid C constructs if you are learning C++.
Per Ben Voigt's comment, there are legitimate use cases for the old-style C library if you want to avoid dynamic allocation. But in that case, you should also avoid std::string and std::vector.
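One detail from the short version of the question is worth a separate sketch: fprintf(stdout, some_cpp_str.c_str()) uses file content as the format string, so any '%' in a line would be interpreted as a conversion. Whether that has anything to do with the crash on the embedded device is not known; the line quoted in the question contains no '%'. The hypothetical helper printNumbered below keeps the text out of the format string and reports how many lines were written.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Hypothetical helper: prints each line prefixed by its 1-based index.
// The line text is written with fwrite, so it is never parsed as a format.
// Returns the number of lines written successfully.
int printNumbered(std::FILE* out, const std::vector<std::string>& lines) {
    int printed = 0;
    for (std::size_t i = 0; i < lines.size(); ++i) {
        if (std::fprintf(out, "%zu)", i + 1) < 0) {
            break;  // output error on the index prefix
        }
        const std::string& text = lines[i];
        if (std::fwrite(text.data(), 1, text.size(), out) != text.size()) {
            break;  // short write
        }
        ++printed;
    }
    return printed;
}
```

Checking the return values also gives a way to tell an output error apart from a crash, which can help when narrowing the problem down on the device.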

C++ Linux /dev/* fwrite/fread fails but write/read succeeds

I'm writing to the /dev interface of a hardware device on Linux. The /dev interface is presented as a Linux file; to talk to the device you simply read and write the file. I am using the standard C++ file wrappers std::fwrite and std::fread because I need access to the file's underlying file descriptor for ioctl calls, which is not exposed by the preferred std::ofstream, but I digress.
The issue is simple: a write followed by a read fails when using the std::* calls. It appears to be an issue with fseek, but I am unsure. With the fseek code as shown below, successive writes return as if they succeeded but no data is written; without the fseek code, the std::fread call returns an error value. Curiously, the Linux file functions (write and read) work perfectly, without any fseek mess or anything at all. My question is: why?
Linux function version (works perfectly):
bool Write(const std::vector<T> &data)
{
if(write(GetFileDescriptor(),&data[0],sizeof(T) * data.size()) ==
sizeof(T) * data.size())
return true;
return false;
}
std::vector<T> Read(int CountOfT)
{
std::vector<T> buf(CountOfT);
if(read(GetFileDescriptor(), &buf[0], sizeof(T) * CountOfT) !=
sizeof(T) * CountOfT)
throw "stuff"; //i actually use std::optional
return buf;
}
STD Version (fails)
bool Write(const std::vector<T> &data)
{
if(std::fwrite(data.data(), sizeof(T), data.size(), m_fd.get()) <
data.size())
return false;
return true;
}
std::vector<T> Read(int CountOfT)
{
long fileoffset = std::ftell(m_fd.get()); //get current offset
std::fseek(m_fd.get(),0,SEEK_SET); //place offset at file start
std::vector<T> buf(CountOfT);
if(std::fread(&buf[0],sizeof(T),buf.size(),m_fd.get()) < CountOfT)
throw "stuff";
std::fseek(m_fd.get(),fileoffset,SEEK_SET); //reset to where it was
return buf;
}
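One difference between the two versions that may be relevant, offered as an assumption rather than a diagnosis of this device: stdio streams are buffered, so bytes passed to std::fwrite may not reach the underlying file until the stream is flushed, and the C standard requires a flush or a positioning call between a write and a following read on the same stream. The sketch below shows that sequence on an ordinary seekable file; writeThenReadBack is a hypothetical helper, and a device node may not support seeking at all.

```cpp
#include <cstdio>
#include <vector>

// Hypothetical helper: writes `data`, flushes it, then reads the same number
// of bytes back from the start of the file into `out`.
bool writeThenReadBack(std::FILE* fp, const std::vector<char>& data,
                       std::vector<char>& out) {
    if (std::fwrite(data.data(), 1, data.size(), fp) != data.size()) {
        return false;
    }
    // Push buffered bytes to the underlying file before switching to reading.
    if (std::fflush(fp) != 0) {
        return false;
    }
    if (std::fseek(fp, 0, SEEK_SET) != 0) {
        return false;
    }
    out.resize(data.size());
    return std::fread(out.data(), 1, out.size(), fp) == out.size();
}
```

For a stream that must behave like direct write and read calls, std::setvbuf(fp, nullptr, _IONBF, 0) called right after opening turns buffering off, and on POSIX systems fileno(fp) returns the descriptor needed for ioctl.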

Byte output to binary file C++

I'm implementing Huffman coding and everything was OK until I tried to save the result into the archive file. Our teacher suggested doing it with a function like this (it takes one bit at a time, and after receiving 8 of them it should output a byte):
long buff=0;
int counter=0;
std::ofstream out("output", std::iostream::binary);
void putbit(bool b)
{
buff<<=1;
if (b) buff++;
counter++;
if (counter>=8)
{
out.put(buff);
counter=0;
buff=0;
}
}
I tried an example with inputting sequence of bits like this:
0011001011001101111010010001000001010101101100
but the output file in binary mode includes just: 1111111
As the buff variable has the correct numbers (25 102 250 68 21 108), I suspected that I had copied the code into my notebook incorrectly and that something is wrong with this line:
out.put(buff);
I tried to replace it with this line:
out << buff;
but got: 1111111111111111
Another way was:
out.write((char *) &buff, 8);
which gives:
100000001000000010000000100000001000000010000000
It looks like the closest to the correct answer, but it still doesn't work correctly.
Maybe I don't understand something about file output.
Question:
Could you explain to me how to make it work and why the previous variants are wrong?
UPD:
The input comes from this function:
void code(std::vector<bool> cur, std::vector<bool> sh, std::vector<bool>* codes, Node* r)
{
if (r->l)
{
cur.push_back(0);
if (r->l->symb)
{
putbit(0);
codes[(int)r->l->symb] = cur;
for (int i=7; i>=0; i--)
{
if ((int)r->l->symb & (1 << i))
putbit(1);
else putbit(0);
}
}
else
{
putbit(0);
code(cur, sh, codes, r->l);
}
cur.pop_back();
}
if (r->r)
{
cur.push_back(1);
if (r->r->symb)
{
putbit(1);
codes[(int)r->r->symb] = cur;
for (int i=7; i>=0; i--)
{
if ((int)r->r->symb & (1 << i))
putbit(1);
else putbit(0);
}
}
else
{
putbit(1);
code(cur, sh, codes, r->r);
}
cur.pop_back();
}
}
The thing is, your putbit function is working (though it's terrible: you use globals, and your buffer should be a char).
For example, this is how I tested your function.
out.open( "outfile", std::ios::binary );
if ( out.is_open() ) {
putbit(1);
putbit(1);
putbit(0);
putbit(1);
putbit(0);
putbit(1);
putbit(0);
putbit(0);
out.close();
}
This should output 1101 0100, or d4 in hex.
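To make the difference between two of the variants from the question concrete, here is a small sketch that sends the same value through put() and through operator<<. The helper names viaPut and viaInsert are hypothetical, and an in-memory std::ostringstream stands in for the output file.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: writes the low byte of `value` as one raw byte.
std::string viaPut(long value) {
    std::ostringstream out;
    out.put(static_cast<char>(value));
    return out.str();
}

// Hypothetical helper: formats `value` as decimal text, one character per digit.
std::string viaInsert(long value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

For the value 0xd4, viaPut produces a single byte, while viaInsert produces the three characters "212". That is why out << buff does not give the packed bytes the question expects.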
I believe this is an XY problem. The problem you're trying to solve is not in the putbit function but rather in the way you use it and in your algorithm.
You said that you had the right values before putting your data into the output file. There are many questions similar to yours on Stack Overflow; just look for them.
The real problem is that the putbit function is not enough to solve your problem. You rely on the fact that it will write a byte after you call it 8 times. What if you write fewer than 8 bits? Also, you never flush your file (at least in the code you posted), so there's no guarantee that all data will be written.
First you must understand how file handles (streams) work. Open your file locally, check if it's open and close it when you're done. Closing also guarantees that all data in the file buffer is written to the file.
outfile.open( "output", std::ios::binary );
if ( outfile.is_open() ) {
// ... use file ...
outfile.close();
}
else {
// Couldn't open file!
}
Other questions solve this by writing, or using, a BitStream. It would look somewhat like this,
class OutBitstream {
public:
OutBitstream();
~OutBitstream(); // close file
bool isOpen();
void open( const std::string &file );
void close(); // close file, also write pending bits
void writeBit( bool b ); // or putbit, use the names you prefer
void writeByte( char c );
void writePendingBits(); // write bits in the buffer they may
// be less than 8 so you may have to do some padding
private:
std::ofstream _out;
char _bitBuffer; //or std::bitset<8>
int _numbits;
};
With this interface it should be easier to handle bit input. No globals as well. I hope that helps.
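A minimal sketch of how such a class might be implemented follows. It deviates from the interface above in a few assumed ways: the class is called BitWriter, it writes to any std::ostream passed in by reference instead of owning a std::ofstream, and it pads the last partial byte with zero bits on the right. None of these choices come from the question; they only keep the example short.

```cpp
#include <ostream>
#include <sstream>

// Hypothetical bit writer: packs bits most-significant-first into bytes.
class BitWriter {
public:
    explicit BitWriter(std::ostream& out) : out_(out), buffer_(0), count_(0) {}

    // Writes any pending bits when the writer goes out of scope.
    ~BitWriter() { flush(); }

    void writeBit(bool bit) {
        buffer_ = static_cast<unsigned char>((buffer_ << 1) | (bit ? 1 : 0));
        if (++count_ == 8) {
            emit();
        }
    }

    // Pads a partial byte with zero bits on the right and writes it.
    void flush() {
        if (count_ == 0) {
            return;
        }
        buffer_ = static_cast<unsigned char>(buffer_ << (8 - count_));
        emit();
    }

private:
    void emit() {
        out_.put(static_cast<char>(buffer_));
        buffer_ = 0;
        count_ = 0;
    }

    std::ostream& out_;
    unsigned char buffer_;
    int count_;
};
```

With a std::ofstream opened in std::ios::binary mode as the target, writing the eight bits 1 1 0 1 0 1 0 0 yields the byte d4. A decoder also needs to know how many padding bits were added, for example by storing the total bit count in a header.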

How to get the return value between two different c++ program using Visual Studio 2005

I have two programs, A and B. Program A uses system() to execute program B.
However, program B returns its result by writing it to a file.
Is there a better way for program A to get the return value of program B?
For example
In program A
int main(){
system("B.exe");
readFile(finePath);
//do something
return 0;
}
In program B
int main(){
char temp[1024];
//do something
writeFile(temp);
return 0;
}
Pipes are a relatively simple, cross-platform way to do this without creating temporary files all over the place and having to deal with the additional issues that potentially arise from doing that.
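#include <cstdio>
#include <string>
using std::string;

// Runs cmd and returns everything it wrote to stdout.
// _popen/_pclose are the Windows CRT names; POSIX systems call them popen/pclose.
static string pcommand(const string& cmd)
{
FILE* stream = _popen(cmd.c_str(), "r");
string data;
if (stream)
{
const int buffer_size = 256;
char buffer[buffer_size];
// Loop on fgets itself; feof only becomes true after a read has already failed.
while (fgets(buffer, buffer_size, stream))
{
data.append(buffer);
}
_pclose(stream);
}
return data;
}
int main()
{
string str = pcommand("dir");
// str now contains the results sent to stdout
}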
Method 1.
Try using the ERRORLEVEL system variable to check the return value of any running program.
Note:
ERRORLEVEL is a system variable so use it as such... ;)
Method 2.
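You can use the Process.ExitCode property (part of the .NET System.Diagnostics.Process class, so this applies when program A is managed code).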
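If a single integer is enough, program B can return it from main and program A can read it from the value system() gives back. The sketch below assumes a hypothetical helper runAndGetExitCode; on Windows the CRT's system() returns the value returned by the command interpreter, while on POSIX systems the result is a wait status that has to be decoded with WEXITSTATUS.

```cpp
#include <cstdlib>
#ifndef _WIN32
#include <sys/wait.h>
#endif

// Hypothetical helper: runs `command` through the system shell and returns
// the exit code of the command, or -1 if it could not be run or did not
// exit normally.
int runAndGetExitCode(const char* command) {
    const int status = std::system(command);
#ifdef _WIN32
    return status;  // value returned by the command interpreter
#else
    if (status == -1 || !WIFEXITED(status)) {
        return -1;
    }
    return WEXITSTATUS(status);  // low 8 bits of the child's exit code
#endif
}
```

In program A this would be called as runAndGetExitCode("B.exe"), with program B ending in return someCode; instead of return 0;. Exit codes are small integers, so larger results such as the 1024-byte buffer in the question still need a pipe or a file.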