I'm trying to read an image into a char array. Here is my attempt:
ifstream file ("htdocs/image.png", ios::in | ios::binary | ios::ate);
ifstream::pos_type fileSize;
char* fileContents;
if(file.is_open())
{
fileSize = file.tellg();
fileContents = new char[fileSize];
file.seekg(0, ios::beg);
if(!file.read(fileContents, fileSize))
{
cout << "fail to read" << endl;
}
file.close();
cout << "size: " << fileSize << endl;
cout << "sizeof: " << sizeof(fileContents) << endl;
cout << "length: " << strlen(fileContents) << endl;
cout << "random: " << fileContents[55] << endl;
cout << fileContents << endl;
}
And this is the output:
size: 1944
sizeof: 8
length: 8
random: ?
?PNG
Can anyone explain this to me? Is there an end-of-file char at position 8? This example was taken from cplusplus.com
Running Mac OS X and compiling with XCode.
Returns the size of the file; the size of your image.png is 1944 bytes.
cout << "size: " << fileSize << endl;
Returns sizeof(char*), which is 8 in your environment. Note that all data pointers have the same size on a given platform, but that size varies from platform to platform.
cout << "sizeof: " << sizeof(fileContents) << endl;
The file you are reading is a binary file, so it may contain bytes with the value 0 as valid data. When you use strlen, it returns the length up to the first 0 byte, which in the case of your file comes after 8 characters.
cout << "length: " << strlen(fileContents) << endl;
Returns the character at the 56th position (remember, array indexing starts from 0) from the start of the file.
cout << "random: " << fileContents[55] << endl;
A suggestion:
Do remember to deallocate the dynamic memory allocation for fileContents using:
delete[] fileContents;
if you don't, you will end up creating a memory leak.
fileSize - the number of bytes in the file.
sizeof( fileContents ) - returns the size of a char* pointer.
strlen( fileContents ) - counts the number of characters until a byte with the value 0 (the NUL character, not the digit '0') is found. That is apparently after just 8 characters - since you are reading BINARY data this is not an unexpected result.
cout << fileContents - like strlen, cout writes out characters until a byte with the value 0 is found. From the output it looks like some of the characters aren't printable.
Your example has some other issues - it doesn't free the memory used, for example. Here's a slightly more robust version:
vector< char > fileContents;
{
ifstream file("htdocs/image.png", ios::in | ios::binary | ios::ate);
if(!file.is_open())
throw runtime_error("couldn't open htdocs/image.png");
fileContents.resize(file.tellg());
file.seekg(0, ios::beg);
if(!file.read(&fileContents[ 0 ], fileContents.size()))
throw runtime_error("failed to read from htdocs/image.png");
}
cout << "size: " << fileContents.size() << endl;
cout << "data:" << endl;
for( unsigned i = 0; i != fileContents.size(); ++i )
{
if( i % 65 == 0 )
cout << '\n';
cout << fileContents[ i ];
}
This answer of mine to another question should be exactly what you are looking for (especially the second part about reading it into a vector<char>, which you should prefer to an array).
As for your output:
sizeof(fileContents) returns the size of a char *, which is 8 on your system (64-bit, I guess)
strlen stops at the first '\0', just as the output operator does.
What do you expect? png files are binary so they may contain '\0' character (character having numeric value 0) somewhere.
If you treat the png file contents as string ('\0' terminated array of characters) and print it as string then it will stop after encountering the first '\0' character.
So there is nothing wrong with the code; fileContents correctly contains the png file (with size 1944 bytes):
size: 1944 // the png is 1944 bytes
sizeof: 8 // sizeof(fileContents) is the sizeof a pointer (fileContents type is char*) which is 8 bytes
length: 8 // the 9th character in the png file is '\0' (numeric 0)
random: ? // the 56th character in the png file
?PNG // the 5th-8th characters are not printable, and the 9th character is '\0', so cout stops there
It's good practice to use unsigned char when working with binary data.
The randomly selected character might not display properly in the console window due to limitations in the fonts supported. You can verify this by printing it in hexadecimal and opening the same file in a hex editor. Please don't forget to delete the allocated memory after use.
I am trying to read a string (ver) from a binary file. The number of characters (numc) in the string is also read from the file. This is how I read the file:
uint32_t numc;
inFile.read((char*)&numc, sizeof(numc));
char* ver = new char[numc];
inFile.read(ver, numc);
cout << "the version is: " << ver << endl;
what I get is the string that I expect plus some other symbols. How can I solve this problem?
A char* string is a nul-terminated sequence of characters. Your code ignores the nul-termination part. Here's how it should look:
uint32_t numc;
inFile.read((char*)&numc, sizeof(numc));
char* ver = new char[numc + 1]; // allocate one extra character for the nul terminator
inFile.read(ver, numc);
ver[numc] = '\0'; // add the nul terminator
cout << "the version is: " << ver << endl;
Also, it's sizeof(numc), not size(numc), although maybe that's a typo.
I need to create a string capable of holding the entire book 'The Hunger Games' which comes out to around 100500 words. My code can capture samples of the txt, but anytime I exceed a string size of 36603(tested), I receive a 'stack overflow' error.
I can successfully capture anything below 36603 elements and can output them perfectly.
#include <iostream>
#include <fstream>
#include <string>
using namespace std;
int main()
{
int i;
char set[100];
string fullFile[100000]; // this will not execute if set to over 36603
ifstream myfile("HungerGames.txt");
if (myfile.is_open())
{
// saves 'i limiter' words from the .txt to fullFile
for (i = 0; i < 100000; i++) {
//each word is separated by a space
myfile.getline(set, 100, ' ');
fullFile[i] = set;
}
myfile.close();
}
else cout << "Unable to open file";
//prints 'i limiter' words to window
for (i = 0; i < 100000; ++i) {
cout << fullFile[i] << ' ';
}
}
What is causing the 'stack overflow' and how can I successfully capture the txt? I will later be doing a word counter and word frequency counter, so I need it in "word per element" form.
There's a limit on how much stack a function can use; use std::vector instead.
More here and here. The default in Visual studio is 1MB (more info here) and you can change it with /F, but this is a bad idea generally.
My system is Lubuntu 18.04, with g++ 7.3. The following snippet shows some "implementation details" of my system, and how to report them on yours. It would help you to understand what your system provides ...
void foo1()
{
int i; // Lubuntu
cout << "\n sizeof(i) " << sizeof(i) << endl; // 4 bytes
char c1[100];
cout << "\n sizeof(c1) " << sizeof(c1) << endl; // 100 bytes
string s1; // empty string
cout << "\n s1.size() " << s1.size() // 0 bytes
<< " sizeof(s1) " << sizeof(s1) << endl; // 32 bytes
s1 = "1234567890"; // now has 10 chars
cout << "\n s1.size() " << s1.size() // 10 bytes
<< " sizeof(s1) " << sizeof(s1) << endl; // 32 bytes
string fullFile[100000]; // this is an array of 100,000 strings
cout << "\n sizeof(fullFile) " // total is vvvvvvvvv
<< sops.digiComma(sizeof(fullFile)) << endl; // 3,200,000 bytes
uint64_t totalChars = 0;
for( auto ff : fullFile ) totalChars += ff.size();
cout << "\n total chars in all strings " << totalChars << endl;
}
What is causing the 'stack overflow' and how can I successfully
capture the txt?
The fullFile array is an unfortunate choice ... because each std::string, even when empty, consumes 32 bytes of automatic memory (~stack), for a total of 3,200,000 bytes, and this is with no data in the strings! This will stack overflow your system when the stack is smaller than the automatic var space.
On Lubuntu the default automatic-memory size (lately) is 10 M Bytes, so not a problem for me. But you will have to check on what your version of your target os defaults to. I think Windows defaults down near 1 M Byte. (Sorry, I don't know how to check Windows automatic-memory size.)
How can I make a string capable of capturing my entire .txt file.
The answer is -- you don't need to make your own. (unless you have some unstated requirement)
Also, you really should look at en.cppreference.com/w/cpp/string/basic_string/append.
In my 1st snippet above, you should take notice that the sizeof(string) reports 32 bytes, regardless of how many chars are in it.
Think on that a while ... if you put 1000 chars into a string, where do they go? The object stays at 32 bytes! You might guess or read that the string object handles memory management on your behalf, and puts all the characters into dynamic memory (heap).
On my system, heap is about 4 G bytes. That's a lot more than stack.
In summary, every single std::string expands auto-magically, using heap, so if your text input will fit in heap, it will fit into '1 std::string'.
While browsing around in the cppreference, check out the 'string::reserve()' command.
Conclusion:
Any std::string you declare can auto-magically 'grow' to support your need, and will thus hold the entire text (if it will fit in memory).
Operationally, you simply get a line of text from the file, then append it to the single string, until the entire file is contained. You only need the one array, which std::string provides.
With this new idea ... I suggest you change fullFile from an array to a string.
string fullFile; // will expand to handle append actions,
// up to the limit of available heap
string line;
// open file ... check status
while (getline(myfile, line)) // fetch a line of text up thru the line feed
{
// Note that getline does not put the \n into 'line'
// there are file state checks that should be done (perhaps here?)
// tbd - line += '\n';
// you may need the line feed in your fullFile string?
fullFile += line; // append the line
}
// ... other file cleanup.
foo1() output on Lubuntu 18.04, g++ v7.3
sizeof(i) 4
sizeof(c1) 100
s1.size() 0 sizeof(s1) 32
s1.size() 10 sizeof(s1) 32
sizeof(fullFile) 3,200,000
total chars in all strings 0
Example slurp() :
string slurp(ifstream& sIn)
{
stringstream ss;
ss << sIn.rdbuf();
dtbAssert(!sIn.bad());
if(sIn.bad())
throw "\n DTB::slurp(sIn) 'ss << sIn.rdbuf()' is bad";
ss.clear(); // clear flags
return ss.str();
}
I have a char[4] dataLabel that when I say
wav.read(dataLabel, sizeof(dataLabel));//Read data label
cout << "Data label:" <<dataLabel << "\n";
I get the output Data label:data� but when I loop through each char I get the correct output, which should be "data".
for (int i = 0; i < sizeof(dataLabel); ++i) {
cout << "Data label " << i << " " << dataLabel[i] << "\n";
}
The sizeof returns 4. I'm at a loss for what the issue is.
EDIT: What confuses me more is that essentially the same code from earlier in my program works perfectly.
ifstream wav;
wav.open("../../Desktop/hello.wav", ios::binary);
char riff[4]; //Char to hold RIFF header
if (wav.is_open()) {
wav.read(riff, sizeof(riff));//Read RIFF header
if ((strcmp(riff, "RIFF"))!=0) {
fprintf(stderr, "Not a wav file");
exit(1);
}
else {
cout << "RIFF:" << riff << "\n";
This prints RIFF:RIFF as intended.
You are missing a null terminator on your character array. Try making it 5 characters and making the last character '\0'. This lets the program know that your string is done without needing to know the size.
What is a null-terminated string?
The overload of operator<< for std::ostream for char const* expects a null terminated string. You are giving it an array of 4 characters.
Use the standard library string class instead:
std::string dataLabel;
See the documentation for istream::read; it doesn't append a null terminator, and you're telling it to read exactly 4 characters. As others have indicated, the << operator is looking for a null terminator so it's continuing to read past the end of the array until it finds one.
I concur with the other suggested answer of using std::string instead of char[].
Your char[] array is not null-terminated, but the << operator that accepts char* input requires a null terminator.
char dataLabel[5];
wav.read(dataLabel, 4); //Read data label
dataLabel[4] = 0;
cout << "Data label:" << dataLabel << "\n";
Variable dataLabel is defined like
char dataLabel[4];
so it has only four characters, which were filled with the characters { 'd', 'a', 't', 'a' } in the statement
wav.read(dataLabel, sizeof(dataLabel));//
So this character array does not have the terminating zero that is required by operator << when its argument is a character array.
Thus in this statement
cout << "Data label:" <<dataLabel << "\n";
the program has undefined behaviour.
Change it to
std::cout << "Data label: ";
std::cout.write( dataLabel, sizeof( dataLabel ) ) << "\n";
I have a file which consists of 69-byte messages. No EOL characters- just message after message. The total number of bytes in the file is exactly 11,465,930,307, which is (11,465,930,307/69) = 166,172,903 messages.
My program memory-maps the file in to a byte array, looks at each 69-byte message and extracts the timestamp. I keep track of which message number I am on and then the timestamp and the message number go in a RowDetails object, which goes in a std::vector<RowDetails> called to_sort, so that I can effectively sort the whole file by timestamp.
std::cout << "Sorting....." << to_sort.size() << " rows..." << std::endl;
std::sort(std::begin(to_sort), std::end(to_sort));
However, then I create a new file which is sorted:
unsigned long long total_bytes=0;
unsigned long long total_rows=0;
ofstream a_file("D:\\sorted_all");
std::cout << "Outputting " << to_sort.size() << " rows..." << std::endl;
std::cout << "Outputting " << (to_sort.size()*69) << " bytes..." << std::endl;
for(RowDetails rd : to_sort){
for(unsigned long long i = rd.msg_number*69; i<(rd.msg_number*69)+69; i++){
a_file << current_bytes[i];
total_bytes++;
}
total_rows++;
}
std::cout << "Vector rows: "<< total_rows <<std::endl;
std::cout << "Bytes: " << total_bytes <<std::endl;
My output:
No. of total bytes (before memory-mapping file): 11,465,930,307 CORRECT
Sorting....... 166,172,903 rows CORRECT
Outputting 166,172,903 rows.... CORRECT
Outputting 11,465,930,307 bytes CORRECT
Vector rows: 166,172,903 CORRECT
Bytes: 11,465,930,169 ERROR, THIS SHOULD BE 307, not 169
How can I be processing the correct number of rows while my total-byte counter is wrong?
When looking at the output file in Windows 7 explorer it says size: 11,503,248,366 bytes, even though the original input file (which I memory-mapped) said the correct 11,465,930,307.
This is just a guess based on the snippet of code you have provided, but I'm willing to bet that rd.msg_number is a 32-bit type. It seems likely that rd.msg_number*69 would then sometimes overflow its 32-bit result, causing incorrect calculations in the inner loop bounds. I would do something like the following:
for(RowDetails rd : to_sort){
long long msg_offset = (long long)rd.msg_number * 69;
for(unsigned long long i = 0; i < 69; i++){
a_file << current_bytes[msg_offset+i];
total_bytes++;
}
total_rows++;
}
For the incorrect output file size, the reason is your a_file output file is opened in the default text mode, instead of binary mode. In text mode, stdio will do EOL conversion which you aren't going to want. So change the file open statement to:
ofstream a_file("d:\\sorted_all", ios::out | ios::binary);
I have this code in C++ (written after some tests to see why I cannot read enough data from the file, so it is not final code; I am trying to find out why I am getting this result):
size_t readSize=629312;
_rawImageFile.seekg(0,ifstream::end);
size_t s=_rawImageFile.tellg();
char *buffer=(char*) malloc(readSize);
_rawImageFile.seekg(0);
int p=_rawImageFile.tellg();
_rawImageFile.read(buffer,readSize);
size_t extracted = _rawImageFile.gcount();
cout << "s="<< s <<endl;
cout << "p="<< p <<endl;
cout << "readsize="<< readSize<<endl;
cout << "extracted="<< extracted <<endl;
cout << "eof ="<< _rawImageFile.eofbit<<endl;
cout << "fail="<< _rawImageFile.failbit <<endl;
The output is as follow:
s=3493940224
p=0
readsize=629312
extracted=2085
eof =1
fail=2
As you can see, the file size is 3493940224 and I am at the start of the file (p=0), and I am trying to read 629312 bytes, but I can only read 2085?
What is the problem with this code? I did open this file in other methods and read some data out of it, but I am using seekg to move the pointer to the beginning of the file.
The file was opened as binary.
edit 1
To find a solution, I put all code inside a function and here is it:
_config=config;
ifstream t_rawImageFile;
t_rawImageFile.open(rawImageFileName,std::ifstream::in || std::ios::binary );
t_rawImageFile.seekg (0);
size_t readSize=629312;
t_rawImageFile.seekg(0,ifstream::end);
size_t s=t_rawImageFile.tellg();
char *buffer=(char*) malloc(readSize);
t_rawImageFile.seekg(0);
size_t p=t_rawImageFile.tellg();
t_rawImageFile.read(buffer,readSize);
size_t x=t_rawImageFile.tellg();
size_t extracted = t_rawImageFile.gcount();
cout << "s="<< s <<endl;
cout << "p="<< p <<endl;
cout << "x="<< x <<endl;
cout << "readsize="<< readSize<<endl;
cout << "extracted="<< extracted <<endl;
cout << "eof ="<< t_rawImageFile.eof()<<endl;
cout << "fail="<< t_rawImageFile.fail() <<endl;
and the result is:
s=3493940224
p=0
x=4294967295
readsize=629312
extracted=2085
eof =1
fail=1
Interestingly, after the read the file pointer moves to a very big value. Is it possible that the application fails because the file size is very big?
edit 2
Tested the same code with another file. the result is as follow:
s=2993007872
p=0
x=4294967295
readsize=629312
extracted=1859
eof =1
fail=1
What I can read from this test is that:
after the read, the file pointer moves to a big number which is always the same. The amount that it reads depends on the file (!).
edit 3
After changing the size_t to fstream::pos_type the result is as follow:
s=2993007872
p=0
x=-1
readsize=629312
extracted=1859
eof =1
fail=1
Why does the file position go to -1 after a read?
t_rawImageFile.open(rawImageFileName, std::ifstream::in || std::ios::binary );
...does not open the file in binary mode. Since || is the lazy or operator and std::ifstream::in is non zero, the whole expression has the value 1.
t_rawImageFile.open(rawImageFileName, std::ifstream::in | std::ios::binary );
...will surely work better.
You don't show the part where your file is being opened, but I'm pretty sure it is missing ios::binary to make sure the C runtime code doesn't interpret CTRL-Z (or CTRL-D) as end of file.
Change this line:
t_rawImageFile.open(rawImageFileName,std::ifstream::in || std::ios::binary );
into this:
t_rawImageFile.open(rawImageFileName,std::ifstream::in | std::ios::binary );