Need some Help Reading File - c++

I need to read a file (not in binary mode). I already have a code to know the size of the file, and what i am searching for is how to read the file by = (the size of the file)-8276 bytes. These bytes that have been read, will be stored in a variable and I will need it to be written.
The size of the file is stored in an unsigned long variable. Can anybody help me?
I use Borland C++

Try this. Its been a while since I've touched Borland, so syntax might be off a bit. Consider it pseudocode, but you get the concept.
// assuming you've already created the file handle.
HANDLE fileHandle;
unsigned long fileSize;
unsigned long numBytesRead;
bool result;
// get the file size
fileSize = GetFileSize(theFile, NULL);
// check to see if filesize is greater than 8276 bytes.
// if so, read (fileSize - 8276)
if(fileSize >= 8276)
{
result = ReadFile(fileHandle, &objectYouAreReadingItTo, (fileSize - 8276), numBytesRead);
}
else
{
//...handle when fileSize is less than 8276 bytes...
}

Related

Is readsome() appropriate to read binary data on Windows?

Context: I am trying to read the content of a PNG picture in C++ to send it later to my Android app. To do so, I open the file in binary mode, read it's content by chuncks of 512 bytes, then send the data to the app. I'm on Windows.
Issue: I use an ifstream instance and the readsome() function as shown below, and it returns me 512, which is what I expected since I asked to read 512 bytes. However, it seems that I am far from really having 512 bytes in my buffer, which confuses me. While I debug my programm step by step, the number of char in the buffer seems random, but is never 512 as expected.
Code:
int currentByteRead = 0;
std::ifstream fl(imgPath.toStdString().c_str(), ios_base::binary);
fl.seekg( 0, std::ios::end );
int length = fl.tellg();
char *imgBytes = new char[512];
fl.seekg(0, std::ios::beg);
// Send the img content by blocks of 512 bytes
while(currentByteRead + 512 < length) {
int nbRead = fl.readsome(imgBytes, 512); // nbRead is always set to 512 here
if(fl.fail()) {
qDebug() << "Error when reading file content";
}
sendMessage(...);
currentByteRead += 512;
imgBytes = new char[512];
}
// Send the remaining data
int nbRemainingBytes = length - currentByteRead;
fl.readsome(imgBytes, nbRemainingBytes);
sendMessage(...);
fl.close();
currentByteRead += nbRemainingBytes;
The length I get at the beginning is the correct one, and it seems there is no error. But it is as if not all the data was copied into the buffer during the readsome() call.
Questions: Did I misunderstood something about the readsome() function ? Is there something related to Windows causing this behaviour ? Is there a more appropriate way to proceed ?
I finally found a way to do what I wanted, and as suggested by David Herring I will put here my answer.
My thoughts about the issue: If I use a std::ifstream::pos_type variable instead of an int, the correct number of bytes is read and put in the buffer. This was not the case when using an int, as if the chars were only written in the buffer until a given (random ?) point. I am not sure to understand why this behavior occurred. My guess was that I had issues with '\n' characters, but the randomness of the final content of the buffer is still unclear for me.
Correction: This is the working code I finally reached nonetheless. Starting with this, I was able to do what I had in mind.
std::ifstream ifs(imgPath.toStdString().c_str(), std::ios::binary|std::ios::ate);
std::ifstream::pos_type pos = ifs.tellg();
int length = ifs.tellg();
std::vector<char> result(pos);
ifs.seekg(0, std::ios::beg);
ifs.read(result.data(), pos);
ifs.close();
I hope this will help others. Thank you David for your suggestions.

Incorrect size of file found using Visual Studio C++

I am porting over c++ code from linux to windows. I am currently using Visual Studio 2013 to port my code.
I need to read a binary file and am using this portion of c++ code:
// Open the stream
std::ifstream is("myfile.bin");
// Determine the file length
is.seekg(0, std::ios_base::end);
std::size_t size=is.tellg();
is.seekg(0, std::ios_base::begin);
// Create a vector to store the data
int* Data = new int[size/sizeof(int)];
// Load the data
is.read((char*) &Data[0], size);
// Close the file
is.close();
In linux, the size of my binary file is correctly found to be 744mb. However, in windows, the size of my binary file is incorrectly found to be >4GB. How can I correct this issue?
Change std::ifstream is("myfile.bin"); to std::ifstream is("myfile.bin", std::ios::binary);
With your current default open mode, the compiler choses "char" mode. In Linux chars in files are UTF8, first 128 positions are 1-byte char. But for memory UTF32, 4-bytes per char, is used. In Windows chars are "wide-chars", 2-bytes per char.
I finally had the time to actually run this myself, though I had to fix a couple of things, like ios_base::beg instead of begin (different function) Also, as mentioned, the array allocation should be this int* Data = new int[size / sizeof(int) + 1]; // At most one extra int
I found your problem: you're not in the right directory. Check if you successfully opened the file or not. If you don't, then you get a huge garbage value (probably -1, but unsigned, so massive) for size.
Try this to find your directory in Windows: (probably need Windows.h or something that I "just had" already)
char dirBuf[256];
GetCurrentDirectory(256, dirBuf);
cout << "Current directory is: " << dirBuf << endl;
See if that's where your file is and move it accordingly. Or specify the ENTIRE path in the constructor to ifstream.
Also, it has nothing to do with ios::binary or not. Works fine both ways, or fails if the file isn't there.
std::size_t size=is.tellg();
The standard doesn't require tellg to return the byte offset from the beginning of the file. In general, this may not be a reliable way to get the size of the file, though it probably does what you expect on Linux and Windows.
The return type of the tellg method is std::basic_stream::pos_type, so you're starting with an implicit conversion to std::size_t which may or may not be appropriate. In a 32-bit build, for example, it's conceivable that the size of a file could be larger than a std::size_t can represent.
But the root problem is that you're not checking for errors. If you have exceptions disabled, then tellg reports an error by returning pos_type(-1). When you cast that to an unsigned type (which std::size_t is), then you get a very large value. I suspect you failed to open the file, and since you didn't detect that error, the seekg and the tellg failed. You then coerced pos_type(-1) to a std::size_t, which made it look like the file was huge.
You also have the problems others have noted: failing to open the file in binary mode and computing the wrong size for the buffer when the file isn't a multiple of the size of an int.
The most reliable to get the file size is to use the OS's API. On Windows, you can do this instead:
// Open the file. [TODO: Get the file name in wide characters and use
// CreateFileW instead. If the file name contains characters not
// representable by the user's ANSI codepage, then CreateFileA will fail.]
HANDLE hfile = CreateFileA("myfile.bin", GENERIC_READ, FILE_SHARE_READ,
nullptr, OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL | FILE_FLAG_SEQUENTIAL_SCAN,
nullptr);
if (hfile == INVALID_HANDLE_VALUE) { error handling here }
// Figure out how big it is.
LARGE_INTEGER li_size;
if (!GetFileSizeEx(hfile, &li_size)) { error handling here }
// TODO: On a 32-bit build, this won't be able to handle huge files,
// so check that here.
std::size_t size = li_size.QuadPart;
// Create a buffer to store the data, being careful to round up to a
// multiple of sizeof(int). [TODO: Use a std::vector instead.]
int* Data = new int[(size + sizeof(int) - 1) / sizeof(int)];
// Load the data.
const DWORD BytesToRead = static_cast<DWORD>(size);
DWORD BytesRead = 0;
if (!ReadFile(hfile, Data, &BytesRead, nullptr) || BytesRead < BytesToRead) {
error handling here
}
// Close the file
CloseHandle(hfile);
int* Data = new int[size/sizeof(int)];
Why are you doing this? You're dividing the size by 4. You don't want to do this. It should just be int* Data = new int[size]
Also, it should be std::ifstream f("filename.bin", std::ios::binary);

Can open small ASCII file, but not large binary file?

I am using the below code to open a large (5.1GB) binary file in MSVC on Windows. The machine has plenty of RAM. The problem is the length is being retrieved as zero. However, when I change the file_path to a smaller ASCII file the code works fine.
Why can I not load the large binary file? I prefer this approach as I wanted a pointer to the file contents.
FILE * pFile;
uint64_t lSize;
char * buffer;
size_t result;
pFile = fopen(file_path, "rb");
if (pFile == NULL) {
fputs("File error", stderr); exit(1);
}
// obtain file size:
fseek(pFile, 0, SEEK_END);
lSize = ftell(pFile); // RETURNS ZERO
rewind(pFile);
// allocate memory to contain the whole file:
buffer = (char*)malloc(sizeof(char)*lSize);
if (buffer == NULL) {
fputs("Memory error", stderr); exit(2);
}
// copy the file into the buffer:
result = fread(buffer, 1, lSize, pFile); // RETURNS ZERO TOO
if (result != lSize) { // THIS FAILS
fputs("Reading error", stderr); exit(3);
}
/* the whole file is now loaded in the memory buffer. */
its not the file permissions or anything, they are fine.
If you allocate 5,1 GB, you'd better be sure that you've compiled your code in 64 bits and run it on a 64 bits windows version. Ohterwhise, the memory address space is limited to maxi 3 GB on a 32 bits Windows and 4 GB with 32 bits code on a 64 bits Windows.
By the way, ftell() returns a signed long. You have to check that there is no error here (such as an overflow if the OS allows larger file sizes), so that the value is not -1.
Edit:
Note that with MSVC, long will currently be a 32 bits number even if compiled for 64 bits. This means that ftell() will give you a meaningful result if the filesize if below 2GB (because fo the sign).
You could use non portable OS specific WinAPI function GetFileSizeEx() to get the size of large files in a signed 64 bit number.
malloc() takes a size_t which is an unsigned 64 bit number. So on this side you're safe.
An alternative would be to use file mapping.
Second edit
I looked at your edits about value received for size, which differ of what i expected. I could reproduce the error on my system, and got a size that was not null, but it was a number much much large than the file.
Looking at this CERT security recommendation, it appeared that the guarantees offered by the standard for fseek() in combination with SEEK_END are unsufficient and make this a very unsafe approach.
So let's repeast: the saffest way to get the size would be to use the native OS function i.e. GetFileSizeEx() on Windows. There's a workaround on a 64 bit windows: use _fseeki64() and _ftelli64():
...
if (_fseeki64(pFile, 0, SEEK_END)) {
fputs("File seek error", stderr);
return (1);
}
lSize = _ftelli64(pFile); // RETURNS EXACT SIZE
...
This worked very well (the initial problem seemed to be linked with the return type which was not large enough). However keep in mind that it's a workaround, and I fear that there could be other error conditions that could lead to the vulnerability reported by CERT.
The data type long is too small to represent you file size. Use the stat() method (or the Windows-specific alternative GetFileAttributes) to read the file size.

Storing an image file into a buffer (gif,jpeg etc).

I'm trying to load an image file into a buffer in order to send it through a scket. The problem that I'm having is that the program creates a buffer with a valid size but it does not copy the whole file into the buffer. My code is as follow
//imgload.cpp
#include <iostream>
#include <stdlib.h>
#include <stdio.h>
using namespace std;
int main(int argc,char *argv){
FILE *f = NULL;
char filename[80];
char *buffer = NULL;
long file_bytes = 0;
char c = '\0';
int i = 0;
printf("-Enter a file to open:");
gets(filename);
f = fopen(filename,"rb");
if (f == NULL){
printf("\nError opening file.\n");
}else{
fseek(f,0,SEEK_END);
file_bytes = ftell(f);
fseek(f,0,SEEK_SET);
buffer = new char[file_bytes+10];
}
if (buffer != NULL){
printf("-%d + 10 bytes allocated\n",file_bytes);
}else{
printf("-Could not allocate memory\n");
// Call exit?.
}
while (c != EOF){
c = fgetc(f);
buffer[i] = c;
i++;
}
c = '\0';
buffer[i-1] = '\0'; // helps remove randome characters in buffer when copying is finished..
i = 0;
printf("buffer size is now: %d\n",strlen(buffer));
//release buffer to os and cleanup....
return 0;
}
> output
c:\Users\Desktop>imgload
-Enter a file to open:img.gif
-3491 + 10 bytes allocated
buffer size is now: 9
c:\Users\Desktop>imgload
-Enter a file to open:img2.gif
-1261 + 10 bytes allocated
buffer size is now: 7
From the output I can see that it's allocating the correct size for each image 3491 and 1261 bytes (i doubled checked the file sizes through windows and the sizes being allocated are correct) but the buffer sizes after supposedly copying is 9 and 7 bytes long. Why is it not copying the entire data?.
You are wrong. Image is binary data, nor string data. So there are two errors:
1) You can't check end of file with EOF constant. Because EOF is often defined as 0xFF and it is valid byte in binary file. So use feof() function to check for end of file. Or also you may check current position in file with maximal possible (you got it before with ftell()).
2) As file is binary it may contain \0 in middle. So you can't use string function to work with such data.
Also I see that you use C++ language. Tell me please why you use classical C syntax for file working? I think that using C++ features such as file streams, containers and iterators will simplify your program.
P.S. And I want to say that you program will have problems with really big files. Who knows maybe you will try to work with them. If 'yes', rewrite ftell/fseek functions to their int64 (long long int) equivalents. Also you'll need to fix array counter. Another good idea is to read file by blocks. Reading byte by byte is dramatically slower.
All this is unneeded and actually makes no sense:
c = '\0';
buffer[i-1] = '\0';
i = 0;
printf("buffer size is now: %d\n",strlen(buffer));
Don't use strlen for binary data. strlen stops at the first NUL (\0) byte. A binary file may contain many such bytes, so NUL can't be used.
-3491 + 10 bytes allocated /* There are 3491 bytes in the file. */
buffer size is now: 9 /* The first byte with the value 0. */
In conclusion, drop that part. You already have the size of the file.
You are reading a binary file like a text file. You can't check for EOF as this could be anywhere in the binary file.

Fastest way to create large file in c++?

Create a flat text file in c++ around 50 - 100 MB
with the content 'Added first line' should be inserted in to the file for 4 million times
using old style file io
fopen the file for write.
fseek to the desired file size - 1.
fwrite a single byte
fclose the file
The fastest way to create a file of a certain size is to simply create a zero-length file using creat() or open() and then change the size using chsize(). This will simply allocate blocks on the disk for the file, the contents will be whatever happened to be in those blocks. It's very fast since no buffer writing needs to take place.
Not sure I understand the question. Do you want to ensure that every character in the file is a printable ASCII character? If so, what about this? Fills the file with "abcdefghabc...."
#include <stdio.h>
int main ()
{
const int FILE_SiZE = 50000; //size in KB
const int BUFFER_SIZE = 1024;
char buffer [BUFFER_SIZE + 1];
int i;
for(i = 0; i < BUFFER_SIZE; i++)
buffer[i] = (char)(i%8 + 'a');
buffer[BUFFER_SIZE] = '\0';
FILE *pFile = fopen ("somefile.txt", "w");
for (i = 0; i < FILE_SIZE; i++)
fprintf(pFile, buffer);
fclose(pFile);
return 0;
}
You haven't mentioned the OS but I'll assume creat/open/close/write are available.
For truly efficient writing and assuming, say, a 4k page and disk block size and a repeated string:
open the file.
allocate 4k * number of chars in your repeated string, ideally aligned to a page boundary.
print repeated string into the memory 4k times, filling the blocks precisely.
Use write() to write out the blocks to disk as many times as necessary. You may wish to write a partial piece for the last block to get the size to come out right.
close the file.
This bypasses the buffering of fopen() and friends, which is good and bad: their buffering means that they're nice and fast, but they are still not going to be as efficient as this, which has no overhead of working with the buffer.
This can easily be written in C++ or C, but does assume that you're going to use POSIX calls rather than iostream or stdio for efficiency's sake, so it's outside the core library specification.
I faced the same problem, creating a ~500MB file on Windows very fast.
The larger buffer you pass to fwrite() the fastest you'll be.
int i;
FILE *fp;
fp = fopen(fname,"wb");
if (fp != NULL) {
// create big block's data
uint8_t b[278528]; // some big chunk size
for( i = 0; i < sizeof(b); i++ ) // custom initialization if != 0x00
{
b[i] = 0xFF;
}
// write all blocks to file
for( i = 0; i < TOT_BLOCKS; i++ )
fwrite(&b, sizeof(b), 1, fp);
fclose (fp);
}
Now at least on my Win7, MinGW, creates file almost instantly.
Compared to fwrite() 1 byte at time, that will complete in 10 Secs.
Passing 4k buffer will complete in 2 Secs.
Fastest way to create large file in c++?
Ok. I assume fastest way means the one that takes the smallest run time.
Create a flat text file in c++ around 50 - 100 MB with the content 'Added first line' should be inserted in to the file for 4 million times.
preallocate the file using old style file io
fopen the file for write.
fseek to the desired file size - 1.
fwrite a single byte
fclose the file
create a string containing the "Added first line\n" a thousand times.
find it's length.
preallocate the file using old style file io
fopen the file for write.
fseek to the the string length * 4000
fwrite a single byte
fclose the file
open the file for read/write
loop 4000 times,
writing the string to the file.
close the file.
That's my best guess.
I'm sure there are a lot of ways to do it.