Why is this fwrite writing garbage? - c++

I'm building a program that takes in serial data and saves it to file. Each line of data is timestamped. In this code, the timestamped line of data is s.
string s = get_timestamp();
cout << "input string named s is: " << s << "\n";
numChars = sizeof(s);
cout << "size is: " << numChars << "\n";
fwrite( &s, sizeof(char) , numChars , DATA_LOG);
The print statements output
00000.27m,379named s is: 20130822.1141,00000.26m,379
size is: 28
You can see that for some reason the "input string named s" seems to be overwritten. This isn't really my main concern though (though I don't know why it's happening.)
My main problem is that my fwrite saves garbage to file. You can see that the numChars and string are correct. I've tried in place of "&s", "static_cast(&s)" with the same garbage results. Any ideas?

First of all, I suspect s contains some carriage returns. A carriage return moves the cursor to the beginning of the line, so further output overwrites what has already been printed. To see the actual characters that get printed, redirect the output of your program to a file, and then use a hex editor/viewer (e.g. xxd) to examine the result.
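If a hex viewer is not at hand, the hidden characters can also be made visible from inside the program. The following is only a sketch; make_visible is a hypothetical helper name, not something from the question's code:

```cpp
#include <cctype>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the original program): returns a copy of
// `s` in which control characters are replaced by visible escape sequences,
// so a stray '\r' shows up as the two characters \ and r.
std::string make_visible(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        if (c == '\r') {
            out += "\\r";
        } else if (c == '\n') {
            out += "\\n";
        } else if (std::isprint(c)) {
            out += static_cast<char>(c);
        } else {
            // Any other non-printable byte is shown as \xHH.
            char buf[8];
            std::snprintf(buf, sizeof buf, "\\x%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}
```

Printing make_visible(s) instead of s in the debug statement should show whether the timestamped line carries a carriage return.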
Secondly, sizeof(s) is not the right way to determine the length of a std::string. Use s.length() instead. This is why numChars is incorrect.
Lastly, to write the string to the file, use:
fwrite( s.data(), sizeof(char) , s.length() , DATA_LOG);
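As a rough sketch of how that call could be wrapped, assuming DATA_LOG is an ordinary FILE* opened elsewhere (write_string is a made-up name for illustration):

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: writes the characters held by `s` (not the bytes of
// the std::string object itself) to `fp`. Returns true only if every
// character was written.
bool write_string(std::FILE* fp, const std::string& s) {
    return std::fwrite(s.data(), sizeof(char), s.size(), fp) == s.size();
}
```

The return value of fwrite is the number of elements actually written, so comparing it against s.size() catches short writes.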

Writing a std::string s to a file would look something like this:
fwrite(s.c_str(), 1, s.size(), DATA_LOG);
There may be other issues with your data, looking at the console printout, but I'm not sure without seeing the actual data in a debugger or similar.

If s were a C string, sizeof(s) should really be strlen(s) or wcslen(s).
Or if you're using the std::basic_string<>, .length() will give you the length and .c_str() will give you the char string pointer, not &s which is the pointer to the actual object.
So try:
if(!s.empty()){
    fwrite(s.c_str(), sizeof(s.front()), s.length(), DATA_LOG);
}

Related

C++ writing to file producing garbage data [closed]

I am trying to add data to a text file. However, for some reason it produces garbage data. I also notice that it writes the correct data once, but then follows it with garbage data.
void TextFileLogger::log(std::string msg){
    using namespace std;
    //ofstream output_file("students.data", ios::binary);
    std::ofstream logFile;
    // creating, opening and writing/appending data to a file
    char filename[] = "log.txt";
    logFile.open(filename, ios::binary | ios::app |ios::out);
    if (logFile.fail())
    {
        std::cout << "The " << filename << " file could not be created/opened!" << std::endl;
        // 0-normal, non zero - some errors
    }
    else
    {
        if (!logFile.write((char*)&msg, sizeof(msg)))
        {
            cout << "Could not write file" << endl;
        }
        else
        {
            streamsize bytesWritten = logFile.tellp();
            if (bytesWritten != sizeof(msg))
            {
                cout << "Could not write expected number of bytes" << endl;
            }
            else
            {
                logFile << msg << std::endl;
                cout << "file written OK" << endl;
            }
        }
    }
}
That one is fun!
(char*)&msg does not do what you expect: std::string is mainly a pointer to a dynamically-allocated buffer which contains the actual data. When you take the std::string's address and try to read what's inside, you get a view of its innards, not its data. Using a C++ static_cast here would have spared you the trouble by telling you that the conversion makes no sense. sizeof(msg) similarly returns the size of the std::string, not the length of its data.
So, your solution is: use msg.data() and msg.size(), it's exactly what they're designed for.
But... why would it (sometimes) output your string, and a bunch of garbage? Well, std::strings typically use SSO (Small String Optimization). The std::string actually contains a small buffer, to store short enough strings without dynamic allocation. When you inspect the whole std::string object, you see this buffer pass by.
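To see the difference between the object and its data in numbers, something like the following sketch can help. SizeReport and measure are names invented for this illustration, and the exact object size is implementation-specific:

```cpp
#include <cstddef>
#include <string>

// Hypothetical struct for illustration only.
struct SizeReport {
    std::size_t object_bytes;  // what sizeof reports: size of the std::string object
    std::size_t text_bytes;    // what size() reports: number of characters stored
};

// sizeof depends only on the type, so object_bytes is the same for every
// string; text_bytes follows the actual contents.
SizeReport measure(const std::string& s) {
    return SizeReport{sizeof(s), s.size()};
}
```

A two-character string and a thousand-character string report the same object_bytes, which is why writing sizeof(msg) bytes never matches the text.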
You are writing the contents of the whole std::string object, with all the member variables that it contains internally.
You either want:
logFile << msg;
or if you really want to use write():
logFile.write( msg.c_str(), msg.length());
And, I wonder: why do you create/open the file in binary mode, when you write strings afterwards?
And finally, you write the data twice, the second time in your last else clause.
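Putting those points together, a stripped-down logger might look roughly like this. It is only a sketch: append_line is a hypothetical free function rather than the asker's TextFileLogger::log, and it assumes one message per line in text mode:

```cpp
#include <fstream>
#include <string>

// Hypothetical free-function variant of the logger: opens `path` in text
// mode for appending and writes `msg` exactly once, followed by a newline.
// Returns false if the file could not be opened or the write failed.
bool append_line(const std::string& path, const std::string& msg) {
    std::ofstream logFile(path, std::ios::app);
    if (!logFile) {
        return false;
    }
    logFile << msg << '\n';
    logFile.flush();
    return !logFile.fail();
}
```

Because the message goes through operator<<, there is no cast and no sizeof involved at all.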
The problem is with this line:
if (!logFile.write((char*)&msg, sizeof(msg)))
It should be this:
if (!logFile.write(msg.c_str(), msg.length()))
Since you are passing a std::string into the function, you should take advantage of the functions it provides (c_str() and length()) instead of trying to cast it to a char* (this always gets messy, and a C-style cast like this would also silently cast away const if msg were const, which is typically bad).
This:
if (!logFile.write((char*)&msg, sizeof(msg)))
is wrong in so many ways. msg is not an array of char, it's a std::string - lying to the compiler by using a cast is always a bad thing to do. And the size of a string object is not the size of the characters it contains. Why the heck are you not using the obvious:
logFile << msg << std::endl;
Replace sizeof(msg) with msg.size(), sizeof() is not doing what you think!
Also (char*)&msg does not do whatever you think, use msg.data() instead.
logFile.write((char*)&msg, sizeof(msg));
should be rewritten to:
logFile.write(msg.data(), msg.size());
or, even better, because operator<< is overloaded for std::string:
logFile << msg;

How to get consistent responses from fstream?

When I read in information via fstream, it has occurred twice, in two different programs, that the input given to my program isn't stable, even if a given file doesn't change.
In my most recent program, which is concerned with reading audio, I'm doing a simple check on the first four letters in the file. These letters are supposed to be RIFF, which they also are - I checked.
So, in order to check the format of a given binary file, I buffer the first four letters and see if they are equal to "RIFF".
char buffer[4];
std::ifstream in(fn,std::ios::binary);
in.read(buffer,4);
if(buffer!="RIFF"){//Always wrong atm
    std::cout << "INVALID WAV FILE: " << buffer << std::endl;
}
When I first made the program, I recall this working properly. Now though, I get an error via my own cout:
INVALID WAV FILE: RIFFýfK
Does anyone have any idea as to what has gone wrong? Perhaps a way to make fstream more consistent?
You're reading 4 characters but not adding a zero terminator. Furthermore, your comparison is wrong: buffer != "RIFF" compares two pointers, not the contents of the strings. You should rather do:
char buffer[5];
std::ifstream in(fn, std::ios::binary);
in.read(buffer, 4);
buffer[4] = '\0'; // Add a zero-terminator at the end
if (strcmp(buffer,"RIFF")) { // If buffer isn't {'R','I','F','F','\0'}..
    std::cout << "INVALID WAV FILE: " << buffer << std::endl;
}
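An alternative that avoids the terminator altogether is to compare the four raw bytes with std::memcmp and to check that the read itself succeeded. A minimal sketch, with has_riff_tag as a hypothetical helper name:

```cpp
#include <cstring>
#include <fstream>
#include <string>

// Hypothetical helper: returns true if the file at `path` can be opened and
// its first four bytes are 'R', 'I', 'F', 'F'. No zero terminator is needed
// because exactly four bytes are compared.
bool has_riff_tag(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    char tag[4];
    if (!in.read(tag, sizeof tag)) {
        return false;  // open failed or fewer than four bytes available
    }
    return std::memcmp(tag, "RIFF", sizeof tag) == 0;
}
```

Checking the result of read also covers files that are shorter than four bytes, where the buffer would otherwise hold indeterminate data.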

How could I copy data that contain '\0' character

I'm trying to copy data that contains '\0'. I'm using C++.
When my search turned up nothing, I decided to write my own function to copy data from one char* to another char*. But it doesn't return the wanted result!
My attempt is the following:
#include <iostream>
char* my_strcpy( char* arr_out, char* arr_in, int bloc )
{
    char* pc= arr_out;
    for(size_t i=0;i<bloc;++i)
    {
        *arr_out++ = *arr_in++ ;
    }
    *arr_out = '\0';
    return pc;
}
int main()
{
    char * out= new char[20];
    my_strcpy(out,"12345aa\0aaaaa  AA",20);
    std::cout<<"output data: "<< out << std::endl;
    std::cout<< "the length of my output data: " << strlen(out)<<std::endl;
    system("pause");
    return 0;
}
the result is here:
I don't understand what is wrong with my code.
Thank you for help in advance.
Your my_strcpy is working fine. When you write a char* to cout or calculate its length with strlen, they stop at \0, as per C string behaviour. By the way, you can use memcpy to copy a block of char regardless of \0.
If you know the length of the 'string' then use memcpy. strcpy will halt its copy when it meets a string terminator, the \0. memcpy will not; it will copy the \0 and anything that follows.
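A small sketch of the memcpy route, using a shorter sample array than the one in the question (sample and copy_bytes are names made up for this example):

```cpp
#include <cstddef>
#include <cstring>

// Sample data with an embedded zero byte. sizeof(sample) is 6:
// 'a', 'b', '\0', 'c', 'd', plus the literal's own terminating '\0'.
const char sample[] = "ab\0cd";

// Hypothetical helper: copies exactly `count` bytes, zero bytes included.
// `dst` must have room for `count` bytes and must not overlap `src`.
void copy_bytes(char* dst, const char* src, std::size_t count) {
    std::memcpy(dst, src, count);
}
```

After copy_bytes(out, sample, sizeof sample), the bytes after the embedded \0 are present in out, even though strlen(out) still reports 2.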
(Note: For any readers who are unaware that \0 is a single-character byte with value zero in string literals in C and C++, not to be confused with the \\0 expression that results in a two-byte sequence of an actual backslash followed by an actual zero in the string... I will direct you to Dr. Rebmu's explanation of how to split a string in C for further misinformation.)
C++ strings can maintain their length independent of any embedded \0. They copy their contents based on this length. The only thing is that the constructor that takes a C-string and no length will be guided by the null terminator as to what you wanted the length to be.
To override this, you can pass in a length explicitly. Make sure the length is accurate, though. You have 17 bytes of data, and 18 if you want the null terminator in the string literal to make it into your string as part of the data.
#include <iostream>
#include <string>
using namespace std;
int main() {
    string str ("12345aa\0aaaaa  AA", 18);
    string str2 = str;
    cout << str;
    cout << str2;
    return 0;
}
(Try not to hardcode such lengths if you can avoid it. Note that you didn't count it right, and when I corrected another answer here they got it wrong as well. It's error prone.)
On my terminal that outputs:
12345aaaaaaa  AA
12345aaaaaaa  AA
But note that what you're doing here is actually streaming a 0 byte to the stdout. I'm not sure how formalized the behavior of different terminal standards is for dealing with that. Things outside of the printable range can be used for all kinds of purposes depending on the kind of terminal you're running... positioning the cursor on the screen, changing the color, etc. I wouldn't write out strings with embedded zeros like that unless I knew what the semantics were going to be on the stream receiving them.
If what you're dealing with are bytes, consider using a std::vector<char> instead, so as not to confuse the issue. Many libraries offer alternatives, such as Qt's QByteArray.
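One way to avoid hardcoding the length, sketched here under the assumption that the data really does come from a string literal, is to let the compiler deduce the array size. from_literal is a hypothetical helper name:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: builds a std::string from a string literal (or any
// char array), keeping embedded zero bytes. N is the full array size
// including the final terminator, which is dropped from the result.
template <std::size_t N>
std::string from_literal(const char (&lit)[N]) {
    return std::string(lit, N - 1);
}
```

With this, from_literal("ab\0cd") yields a string of size 5 without anyone having to count characters by hand. It only works on actual arrays; a plain char* has already lost the size information.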
Your function is fine (except that you should pass to it 17 instead of 20). If you need to output null characters, one way is to convert the data to std::string:
std::string outStr(out, out + 17);
std::cout<< "output data: "<< outStr << std::endl;
std::cout<< "the length of my output data: " << outStr.length() <<std::endl;
I don't understand what is wrong with my code.
my_strcpy(out,"12345aa\0aaaaa  AA",20);
Your string contains the character '\', which is interpreted as the start of an escape sequence. To prevent this, you have to double the backslash:
my_strcpy(out,"12345aa\\0aaaaa  AA",20);
Test
output data: 12345aa\0aaaaa  AA
the length of my output data: 18
Your string is already terminated midway.
my_strcpy(out,"12345aa\0aaaaa  AA",20);
Why do you intend to have \0 in between like that? Use some other delimiter if you so desire.
Otherwise, since std::cout and strlen interpret a \0 as a string terminator, you get surprises.
What I mean is: follow the convention, i.e. use '\0' only as the string terminator.

Insert values into a string without using sprintf or to_string

Currently I only know of two methods to insert values into a C++ string or C string.
The first method I know of is to use std::sprintf() and a C-string buffer (char array).
The second method is to use something like "value of i: " + to_string(value) + "\n".
However, the first one needs the creation of a buffer, which leads to more code if you just want to pass a string to a function. The second one produces long lines of code, where a string gets interrupted every time a value is inserted, which makes the code harder to read.
From Python I know the format() function, which is used like this:
"Value of i: {}\n".format(i)
The braces are replaced by the value in format, and further .format()'s can be appended.
I really like Python's approach on this, because the string stays readable, and no extra buffer needs to be created. Is there any similar way of doing this in C++?
The idiomatic way of formatting data in C++ is with output streams (see the std::ostream reference). If you want the formatted output to end up in a std::string, use an output string stream:
std::ostringstream res;
res << "Value of i: " << i << "\n";
Use the str() member function to harvest the resultant string:
std::string s = res.str();
This matches the approach of formatting data for output:
cout << "Value of i: " << i << "\n";
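If the goal is to pass a formatted string straight to a function without a named buffer, the stream can be hidden behind a small helper. This is just a sketch in C++11; concat and append_all are made-up names, and it is not a replacement for Python-style placeholders:

```cpp
#include <sstream>
#include <string>

// Base case: nothing left to append.
inline void append_all(std::ostringstream&) {}

// Streams the first argument, then recurses on the rest.
template <typename T, typename... Rest>
void append_all(std::ostringstream& os, const T& first, const Rest&... rest) {
    os << first;
    append_all(os, rest...);
}

// Hypothetical helper: streams every argument into one std::string,
// using whatever operator<< each argument type provides.
template <typename... Args>
std::string concat(const Args&... args) {
    std::ostringstream os;
    append_all(os, args...);
    return os.str();
}
```

A call then reads as a single expression, for example some_function(concat("Value of i: ", i, "\n")), where some_function stands for whatever function takes the string.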

What's the difference between putting std::string and std::string::c_str() into a stringstream?

We're seeing a strange scenario that basically boils down to the following:
std::string something = "someval";
std::stringstream s;
s << something;
std::cout << s.str();
is not equal to:
std::string something = "someval";
std::stringstream s;
s << something.c_str();
std::cout << s.str();
Taking that a step further - the output is not gibberish in either case. What is happening is the output from case 1 appears to be mapped to another (valid) string in the system whereas the output from case 2 is what is expected.
We see this behavior by simply changing:
s << something;
To:
s << something.c_str();
I know this sounds crazy (or it does to me), and I haven't been able to replicate it out of the larger system - so sorry for no "working" example. But does anyone know how this kind of thing can happen? Can we be stepping on memory somewhere or doing something to a stringtable in some location or anything else like that?
It is different if the string contains nul characters, '\0'.
The .c_str() version will compute the length up to the nul, while the std::string output will know its length and output all its characters.
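The difference is easy to reproduce in isolation. In this sketch, via_string and via_c_str are hypothetical helper names that mirror the two cases from the question:

```cpp
#include <sstream>
#include <string>

// Case 1: stream the std::string itself. All size() characters are
// inserted, embedded zero bytes included.
std::string via_string(const std::string& s) {
    std::stringstream ss;
    ss << s;
    return ss.str();
}

// Case 2: stream the const char* from c_str(). Insertion stops at the
// first zero byte, just like strlen would.
std::string via_c_str(const std::string& s) {
    std::stringstream ss;
    ss << s.c_str();
    return ss.str();
}
```

For a string without embedded zero bytes both helpers return the same text; for std::string("some\0val", 8) the first returns all 8 characters and the second only "some".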