I have the following function (so far):
void read_binary_file(std::istream is,
ByteArray arr)
{
int length = is.tellg();
char *buffer = new char[length];
is.read(buffer, length);
// What to do next?
// The goal is to place the istream buffer in the `values` member of my `ByteArray` class.
// `values` is an array of `float`; each item should be 4 bytes from the buffer
}
My goal is to place each 4 bytes from the buffer into the values member of my ByteArray class. Each item should contain 4 bytes from the buffer.
ByteArray definition:
class ByteArray
{
....
float *values;
}
Limitations: I don't want to use stl/ vector classes.
I couldn't find an example with my current limitations.
Any idea how I can do that?
If I understand correctly, you want to create a ByteArray object and copy bytes from buffer to ByteArray::values[] as floats. Assuming that the file is opened in binary mode and contains floats dumped in the correct format and endianness, and that the total data in the file is a multiple of sizeof(float):
class ByteArray
{
private:
float* values;
public:
void set(char* buffer, int len)
{
values = new float[len/4];
for(int itr =0; itr < len/4; itr++)
{
values[itr] = *(float*)(buffer+itr*4);
}
}
};
...
arr.set(buffer, length);
Note that i) smarter code is possible, but I kept it as simple as possible for your understanding. ii) Ulrich is right, you should pass istream by reference (as well as ByteArray for most practical purposes):
void read_binary_file(std::istream& is,
ByteArray& arr)
...
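Putting these pieces together, a minimal sketch might look like the following. It assumes the stream is seekable and opened in binary mode, and it adds a hypothetical `count` member to ByteArray (not in the original class) so the caller knows how many floats were stored. Note that tellg() on its own reports the current position, so the sketch seeks to the end first to measure the size, and it uses std::memcpy rather than a pointer cast to avoid alignment concerns. No STL containers are used.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <istream>
#include <sstream>

// Hypothetical stand-in for the asker's class; `count` is an assumed extra member.
class ByteArray
{
public:
    float* values = nullptr;
    std::size_t count = 0;
    ~ByteArray() { delete[] values; }
};

// Reads the whole stream and stores every sizeof(float) bytes as one float.
// Returns false if the stream cannot be measured or read completely.
bool read_binary_file(std::istream& is, ByteArray& arr)
{
    is.seekg(0, std::ios::end);               // jump to the end to measure the size
    const std::streamoff length = is.tellg();
    is.seekg(0, std::ios::beg);               // rewind before reading
    if (!is || length < 0)
        return false;

    const std::size_t bytes = static_cast<std::size_t>(length);
    char* buffer = new char[bytes];
    is.read(buffer, static_cast<std::streamsize>(bytes));
    const bool ok = static_cast<std::size_t>(is.gcount()) == bytes;
    if (ok)
    {
        delete[] arr.values;                  // drop any previous contents
        arr.count = bytes / sizeof(float);    // a trailing partial float is ignored
        assert(arr.count * sizeof(float) <= bytes);
        arr.values = new float[arr.count];
        std::memcpy(arr.values, buffer, arr.count * sizeof(float));
    }
    delete[] buffer;
    return ok;
}

// Usage sketch: round-trip two floats through an in-memory binary stream.
bool demo_read_binary_file()
{
    const float input[2] = {1.5f, -2.25f};
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    ss.write(reinterpret_cast<const char*>(input), sizeof(input));
    ByteArray arr;
    return read_binary_file(ss, arr) && arr.count == 2 &&
           arr.values[0] == 1.5f && arr.values[1] == -2.25f;
}
```

The same function should work with a std::ifstream opened with std::ios::binary, since it only relies on the std::istream interface.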
If you want to use istream to send bytes byte by byte you can say
arr.values=(float*)buffer;
or
arr.values=new float[length/4];
memcpy(arr.values,buffer,length);
delete[] buffer;
It works until you want to send a float that happens to contain a byte the stream treats as end-of-file (for example 0x1A when the stream is opened in text mode on Windows). Such byte patterns are not uncommon, and the stream stops reading at that byte. So I recommend not sending floats byte by byte in stringstreams unless the stream is in binary mode. Send them another way, e.g. as hexadecimal text. (That way you don't lose precision.)
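One way to read the hexadecimal suggestion is to send each float as hexadecimal floating-point text, which is exact and contains only printable characters. The sketch below is an assumption about what was meant, not the answerer's own code; the helper names are hypothetical, and it relies on the C99 "%a" format and on std::strtof accepting hexadecimal input (C++11).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Formats `value` as hexadecimal floating-point text into `text`.
void float_to_hex_text(float value, char* text, std::size_t size)
{
    assert(text != nullptr && size > 0);
    // A float passed to a printf-style function is promoted to double anyway.
    std::snprintf(text, size, "%a", static_cast<double>(value));
}

// Parses text produced by float_to_hex_text back into a float.
float hex_text_to_float(const char* text)
{
    return std::strtof(text, nullptr);
}

// Usage sketch: 0.1f has no short decimal form, yet it survives the round trip.
bool demo_hex_text_round_trip()
{
    char text[64];
    float_to_hex_text(0.1f, text, sizeof(text));
    return hex_text_to_float(text) == 0.1f;
}
```

The text form is larger than the raw 4 bytes, but it can pass through text-mode streams without any byte being misinterpreted.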
What generated the file you want to read?
I am wondering if it is possible to convert vector of pairs into a byte array.
Here's a small example of creating the vector of pairs:
int main(int argc, char *argv[])
{
PBYTE FileData, FileData2, FileData3;
DWORD FileSize, FileSize2, FileSize3;
/* Here I read 3 files + their sizes and fill the above variables. */
//Here I create the vector of std::pairs.
std::vector<std::pair<PBYTE, DWORD>> DataVector
{
{ FileData, FileSize }, //Pair contains always file data + file size.
{ FileData2, FileSize2 },
{ FileData3, FileSize3 }
};
std::cin.ignore(2);
return 0;
}
Is it possible to convert this vector into a byte array (for compressing, and writing to disk, etc)?
Here is what I tried, but I didn't get even the size correctly:
PVOID DataVectorArr = NULL;
DWORD DataVectorArrSize = DataVector.size() * sizeof DataVector[0];
if ((DataVectorArr = malloc(DataVectorArrSize)) != NULL)
{
memcpy(DataVectorArr, &DataVector[0], DataVectorArrSize);
}
std::cout << DataVectorArrSize;
//... Here I tried to write the DataVectorArr to disk, which obviously fails because the size isn't correct. I am not also sure if the DataVectorArr contains the DataVector now.
if (DataVectorArr != NULL) delete DataVectorArr;
Enough code. Is it even possible, or am I doing it wrong? If I am doing it wrong, what would be the solution?
Regards, Okkaaj
Edit: If it's unclear what I am trying to do, read the following (which I commented earlier):
Yes, I am trying to cast the vector of pairs to a PCHAR or PBYTE - so I can store it to disk using WriteFile. After it is stored, I can read it from disk as byte array, and parse back to vector of pairs. Is this possible? I got the idea from converting / casting struct to a byte array and back(read more from here: Converting struct to byte and back to struct) but I am not sure if this is possible with std::vector instead of structures.
Get rid of the malloc and make use of RAII for this:
std::vector<BYTE> bytes;
for (auto const& x : DataVector)
bytes.insert(bytes.end(), x.first, x.first+x.second);
// bytes now contains all images butted end-to-end.
std::cout << bytes.size() << '\n';
To avoid potential resize slow-lanes, you can enumerate the size calculation first, then .reserve() the space ahead of time:
std::size_t total_len = 0;
for (auto const& x : DataVector)
total_len += x.second;
std::vector<BYTE> bytes;
bytes.reserve(total_len);
for (auto const& x : DataVector)
bytes.insert(bytes.end(), x.first, x.first+x.second);
// bytes now contains all images butted end-to-end.
std::cout << bytes.size() << '\n';
But if all you want to do is dump these contiguously to disk, then why not simply:
std::ofstream outp("outfile.bin", std::ios::out|std::ios::binary);
for (auto const& x : DataVector)
outp.write(reinterpret_cast<const char*>(x.first), x.second); // PBYTE to const char* needs reinterpret_cast
outp.close();
skipping the middle man entirely.
And honestly, unless there is a good reason to do otherwise, it is highly likely your DataVector would be better off as simply a std::vector< std::vector<BYTE> > in the first place.
Update
If recovery is needed, you can't just do this as above. The minimal artifact that is missing is the description of the data itself. In this case the description is the actual length of each pair segment. To accomplish that, the length must be stored along with the data. Doing that is trivial unless you also need it to be portable across platforms.
If that last sentence made you raise your brow, consider the problems with doing something as simple as this:
std::ofstream outp("outfile.bin", std::ios::out|std::ios::binary);
for (auto const& x : DataVector)
{
uint64_t len = static_cast<uint64_t>(x.second); // the length is the second member of the pair
outp.write(reinterpret_cast<const char *>(&len), sizeof(len));
outp.write(reinterpret_cast<const char*>(x.first), x.second);
}
outp.close();
Well, now you can read each file by doing this:
1. Read a uint64_t to obtain the byte length of the data to follow
2. Read the data of that length
But this has inherent problems. It isn't portable at all. The endian-representation of the reader's platform had better match that of the writer, or this is an utter fail. To accommodate this limitation the length preamble must be written in a platform-independent manner, which is tedious and a foundational reason why serialization libraries and their protocols exist in the first place.
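As an illustration of what "platform-independent" could mean for the length preamble, the sketch below writes the 64-bit length one byte at a time, most significant byte first, so the bytes on disk do not depend on the host's byte order. The function names are hypothetical and the big-endian choice is an assumption; any fixed order works as long as the reader and the writer agree.

```cpp
#include <cassert>
#include <cstdint>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Writes `len` as 8 bytes, most significant byte first (big-endian),
// regardless of the byte order of the machine running this code.
void write_length(std::ostream& out, std::uint64_t len)
{
    char bytes[8];
    for (int i = 0; i < 8; ++i)
        bytes[i] = static_cast<char>((len >> (8 * (7 - i))) & 0xFF);
    out.write(bytes, sizeof(bytes));
}

// Reads a length written by write_length; returns false on a short read.
bool read_length(std::istream& in, std::uint64_t& len)
{
    unsigned char bytes[8];
    if (!in.read(reinterpret_cast<char*>(bytes), sizeof(bytes)))
        return false;
    assert(in.gcount() == 8);
    len = 0;
    for (int i = 0; i < 8; ++i)
        len = (len << 8) | bytes[i];
    return true;
}

// Usage sketch: checks both the byte layout and the round trip.
bool demo_length_round_trip()
{
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    write_length(ss, 0x0102030405060708ULL);
    const std::string raw = ss.str();
    std::uint64_t len = 0;
    return raw.size() == 8 && raw[0] == 0x01 && raw[7] == 0x08 &&
           read_length(ss, len) && len == 0x0102030405060708ULL;
}
```

This only covers the length field; the payload bytes themselves are copied verbatim and need no conversion.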
If you haven't second-guessed what you're doing and how you're doing it by this point, you may want to read this again.
I'm writing a resource file which I want to insert a bunch of data from various common files such as .JPG, .BMP (for example) and I want it to be in binary.
I'm going to code something to retrieve these data later on organized by index, and this is what I got so far:
float randomValue = 23.14f;
ofstream fileWriter;
fileWriter.open("myFile.dat", ios::binary);
fileWriter.write((char*)&randomValue, sizeof(randomValue));
fileWriter.close();
//With this my .dat file, when opened in notepad has "B!¹A" in it
float retrieveValue = 0.0f;
ifstream fileReader;
fileReader.open("myFile.dat", ios::binary);
fileReader.read((char*)&retrieveValue, sizeof(retrieveValue));
fileReader.close();
cout << retrieveValue << endl; //This gives me exactly the 23.14 I wanted, perfect!
While this works nicely, I'd like to understand what exactly is happening there.
I'm converting the address of randomValue to char*, and writing the values in this address to the file?
I'm curious also because I need to do this for an array, and I can't do this:
int* myArray = new int[10];
//fill myArray values with random stuff
fileWriter.open("myFile.dat", ios::binary);
fileWriter.write((char*)&myArray, sizeof(myArray));
fileWriter.close();
From what I understand, this would just write the first address' value in the file, not all the array. So, for testing, I'm trying to simply convert a variable to a char* which I would write to a file, and convert back to the variable to see if I'm retrieving the values correctly, so I'm with this:
int* intArray = new int[10];
for(int i = 0; i < 10; i++)
{
cout << &intArray[i]; //the address of each number in my array
cout << intArray[i]; //its value
cout << reinterpret_cast<char*>(&intArray[i]); //the char* value of each one
}
But for some reason I don't know, my computer "beeps" when I run this code. During the array, I'm also saving these to a char* and trying to convert back to int, but I'm not getting the results expected, I'm getting some really long values.
Something like:
float randomValue = 23.14f;
char* charValue = reinterpret_cast<char*>(&randomValue);
//charValue contains "B!¹A" plus a bunch of other (uninitialized values?) characters, so I'm guessing the value is correct
//Now I'm here
I want to convert charValue back to randomValue, how can I do it?
edit: There's valuable information in the answers below, but they don't solve my (original) problem. I was testing these types of conversions because I'm writing code that will pick up a bunch of resource files such as BMP, JPG, MP3, and save them in a single .DAT file organized by some criteria I still haven't fully figured out.
Later, I am going to use this resource file to read from and load these contents into a program (game) I'm coding.
I am still thinking about the criteria, but I was wondering if it's possible to do something like this:
//In my ResourceFile.DAT
[4 bytes = objectID][3 bytes = objectType (WAV, MP3, JPG, BMP, etc)][4 bytes = objectLength][objectLength bytes = actual objectData]
//repeating this until end of file
And then in the code that reads the resource file, I want to do something like this (untested):
ifstream fileReader;
fileReader.open("myFile.DAT", ios::binary);
//file check stuff
int objectID = 0;
while(fileReader.read((char*)&objectID, 4)) //read 4 bytes to fill objectID; the loop ends when no more can be read
{
//Here I'll load
char objectType[3];
fileReader.read(objectType, 3); //read the type so I know which parser to use
int objectLength = 0;
fileReader.read((char*)&objectLength, 4); //get the length of the object data
char* objectData = new char[objectLength];
fileReader.read(objectData, objectLength); //fill objectData with the data
//Here I'll use a parser to fill classes depending on the type etc, and move on to the next obj
delete[] objectData; //release the buffer once it has been parsed
}
Currently my code is working with the original files (BMP, WAV, etc) and filling them into classes, and I want to know how I can save the data from these files into a binary data file.
For example, my class that manages BMP data has this:
class FileBMP
{
public:
int imageWidth;
int imageHeight;
int* imageData;
}
When I load it, I call:
void FileBMP::Load(int iwidth, int iheight)
{
int imageTotalSize = iwidth * iheight * 4;
imageData = new int[imageTotalSize]; //This will give me 4 times the amount of pixels in the image
int cPixel = 0;
while(cPixel < imageTotalSize)
{
imageData[cPixel] = 0; //R value
imageData[cPixel + 1] = 0; //G value
imageData[cPixel + 2] = 0; //B value
imageData[cPixel + 3] = 0; //A value
cPixel += 4;
}
}
So I have this single dimension array containing values in the format of [RGBA] per pixel, which I am using later on for drawing on screen.
I want to be able to save just this array in the binary data format that I am planning that I stated above, and then read it and fill this array.
I think it's asking too much for a code like this, so I'd like to understand what I need to know to save these values into a binary file and then read back to fill it.
Sorry for the long post!
edit2: I solved my problem by making the first edit... thanks for the valuable info, I also got to know what I wanted to!
By using the & operator, you're getting a pointer to the contents of the variable (think of it as just a memory address).
float a = 123.45f;
float* p = &a; // now p points to a, i.e. has the memory address to a's contents.
char* c = (char*)&a; // c points to the same memory location, but the code says to treat the contents as char instead of float.
When you gave the (char*)&randomValue for write(), you simply told "take this memory address having char data and write sizeof(randomValue) chars from there". You're not writing the address value itself, but the contents from that location of memory ("raw binary data").
cout << reinterpret_cast<char*>(&intArray[i]); //the char* value of each one
Here you're expected to give char* type data, terminated with a null char (zero). However, you're providing the raw bytes of the int value instead. Your program might crash here, as cout will output chars until it finds the terminator char -- which it might not find anytime soon.
float randomValue = 23.14f;
char* charValue = reinterpret_cast<char*>(&randomValue);
float back = *(float*)charValue;
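If the aim is simply to look at the raw bytes, a safer option than streaming a char* is to print each byte as a number. The sketch below formats the bytes as hexadecimal text; the helper name is hypothetical, and the caller is assumed to supply an output buffer of at least 2 * size + 1 chars.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Writes two hex digits per byte into `out`, which must hold 2 * size + 1 chars.
void bytes_to_hex(const void* data, std::size_t size, char* out)
{
    assert(data != nullptr && out != nullptr);
    const unsigned char* bytes = static_cast<const unsigned char*>(data);
    for (std::size_t i = 0; i < size; ++i)
        std::snprintf(out + 2 * i, 3, "%02x", static_cast<unsigned>(bytes[i]));
    out[2 * size] = '\0';
}

// Usage sketch with a fixed byte pattern so the result is predictable.
bool demo_bytes_to_hex()
{
    const unsigned char sample[3] = {0x00, 0x7f, 0xff};
    char text[7];
    bytes_to_hex(sample, sizeof(sample), text);
    return std::strcmp(text, "007fff") == 0;
}
```

Passing `&randomValue` and `sizeof(randomValue)` would show the float's bytes in memory order, which depends on the platform's endianness.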
Edit: to save binary data, you simply need to provide the data and write() it. Do not use << operator overloads with ofstream/cout. For example:
int values[3] = { 5, 6, 7 };
struct AnyData
{
float a;
int b;
} data;
cout.write((char*)&values, sizeof(int) * 3); // the other two values follow the first one, you can write them all at once.
cout.write((char*)&data, sizeof(data)); // you can also save structs that do not have pointers.
In case you're going to write structs, have a look at the #pragma pack compiler directive. Compilers may align members to a certain size (e.g. that of an int) by inserting padding, which means that, without the directive, the following struct might actually require 8 bytes:
#pragma pack (push, 1)
struct CouldBeLongerThanYouThink
{
char a;
char b;
};
#pragma pack (pop)
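With two char members most compilers add no padding at all, so the effect is easier to observe with mixed member types. The sketch below uses hypothetical struct names and assumes a compiler that honors #pragma pack (GCC, Clang and MSVC do); the padded size is deliberately not hard-coded because it is implementation-defined.

```cpp
#include <cassert>
#include <cstddef>

// Default layout: the compiler may insert padding after `tag` to align `value`.
struct Padded
{
    char tag;
    int value;
};

#pragma pack(push, 1)
// Packed layout: members are placed back to back.
struct Packed
{
    char tag;
    int value;
};
#pragma pack(pop)

// Checks the relationship between the two layouts.
void check_sizes()
{
    assert(sizeof(Packed) == sizeof(char) + sizeof(int)); // no padding when packed
    assert(sizeof(Padded) >= sizeof(Packed));             // padding only adds bytes
}
```

On common desktop platforms sizeof(Padded) is typically 8 while sizeof(Packed) is 5, which is why a struct written by one build may not match the layout expected by another.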
Also, do not write the pointer values themselves (if there are pointer members in a struct), because the memory addresses will not point to any meaningful data once read back from a file. Always write the data itself, not pointer values.
What's happening is that you're copying the internal
representation of your data to a file, and then copying it back
into memory. This works as long as the program doing the
writing was compiled with the same version of the compiler,
using the same options. Otherwise, it might or it might not
work, depending on any number of things beyond your control.
It's not clear to me what you're trying to do, but formats like
.jpg and .bmp normally specify the format they want the
different types to have, and you have to respect that format.
It is unclear what you really want to do, so I cannot recommend a way of solving your real problem. But I would not be surprised if running the program actually caused beeps or any other strange behavior in your program.
int* intArray = new int[10];
for(int i = 0; i < 10; i++)
{
cout << reinterpret_cast<char*>(&intArray[i]);
}
The memory returned by new above is uninitialized, but you are trying to print it as if it were a null-terminated string. That uninitialized memory could contain the bell character (which causes beeps when printed to the terminal) or any other values. It also might not contain a null terminator, in which case the stream insertion operator will overrun the buffer until it either finds a null or your program crashes accessing invalid memory.
There are other incorrect assumptions in your code, like for example given int *p = new int[10]; the expression sizeof(p) will be the size of a pointer in your architecture, not 10 times the size of an integer.
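For the dynamically allocated array, the sizeof point means the element count has to be tracked separately and the pointer itself, not its address, has to be passed to write(). A minimal sketch under those assumptions follows; the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <ostream>
#include <sstream>

// Writes `count` ints starting at `data` as raw bytes.
void write_ints(std::ostream& out, const int* data, std::size_t count)
{
    assert(data != nullptr || count == 0);
    // Pass the pointer itself (not &data) and the byte size of all elements.
    out.write(reinterpret_cast<const char*>(data),
              static_cast<std::streamsize>(count * sizeof(int)));
}

// Reads `count` ints into `data`; returns false on a short read.
bool read_ints(std::istream& in, int* data, std::size_t count)
{
    assert(data != nullptr || count == 0);
    in.read(reinterpret_cast<char*>(data),
            static_cast<std::streamsize>(count * sizeof(int)));
    return static_cast<bool>(in);
}

// Usage sketch: round-trip a heap-allocated array through an in-memory stream.
bool demo_int_array_round_trip()
{
    int* source = new int[10];
    for (int i = 0; i < 10; ++i)
        source[i] = i * i;
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    write_ints(ss, source, 10);
    int* copy = new int[10]();           // value-initialized to zero
    bool ok = read_ints(ss, copy, 10);
    for (int i = 0; ok && i < 10; ++i)
        ok = (copy[i] == source[i]);
    delete[] source;
    delete[] copy;
    return ok;
}
```

As noted earlier in the thread, the resulting file is only guaranteed to be readable by a program built with the same int size and byte order.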
I've finally figured out how to write some specifically formatted information to a binary file, but now my problem is reading it back and building it back the way it originally was.
Here is my function to write the data:
void save_disk(disk aDisk)
{
ofstream myfile("disk01", ios::out | ios::binary);
int32_t entries;
entries = (int32_t) aDisk.current_file.size();
char buffer[10];
sprintf(buffer, "%d",entries);
myfile.write(buffer, sizeof(int32_t));
std::for_each(aDisk.current_file.begin(), aDisk.current_file.end(), [&] (const file_node& aFile)
{
myfile.write(aFile.name, MAX_FILE_NAME);
myfile.write(aFile.data, BLOCK_SIZE - MAX_FILE_NAME);
});
}
and my structure that it originally was created with and what I want to load it back into is composed as follows.
struct file_node
{
char name[MAX_FILE_NAME];
char data[BLOCK_SIZE - MAX_FILE_NAME];
file_node(){};
};
struct disk
{
vector<file_node> current_file;
};
I don't really know how to read it back in so that it is arranged the same way, but here is my pathetic attempt anyway (I just tried to reverse what I did for saving):
void load_disk(disk aDisk)
{
ifstream myFile("disk01", ios::in | ios::binary);
char buffer[10];
myFile.read(buffer, sizeof(int32_t));
std::for_each(aDisk.current_file.begin(), aDisk.current_file.end(), [&] (file_node& aFile)
{
myFile.read(aFile.name, MAX_FILE_NAME);
myFile.read(aFile.data, BLOCK_SIZE - MAX_FILE_NAME);
});
}
^^ This is absolutely wrong. ^^
I understand the basic operations of the ifstream, but really all I know how to do with it is read in a file of text, anything more complicated than that I'm kind of lost.
Any suggestions on how I can read this in?
You're very close. You need to write and read the length as binary.
This part of your length-write is wrong:
char buffer[10];
sprintf(buffer, "%d",entries);
myfile.write(buffer, sizeof(int32_t));
It only writes the first four bytes of whatever the length is, but the length is character data from a sprintf() call. You need to write this as a binary-value of entries (the integer):
// writing your entry count.
// htonl()/ntohl() are declared in <arpa/inet.h> (POSIX) or <winsock2.h> (Windows).
uint32_t entries = (uint32_t)aDisk.current_file.size();
entries = htonl(entries);
myfile.write((char*)&entries, sizeof(entries));
Then on the read:
// reading the entry count
uint32_t entries = 0;
myFile.read((char*)&entries, sizeof(entries));
entries = ntohl(entries);
// Use this to resize your vector; for_each has places to stuff data now.
aDisk.current_file.resize(entries);
std::for_each(aDisk.current_file.begin(), aDisk.current_file.end(), [&] (file_node& aFile)
{
myFile.read(aFile.name, MAX_FILE_NAME);
myFile.read(aFile.data, BLOCK_SIZE - MAX_FILE_NAME);
});
Or something like that.
Note 1: this does NO error checking. The htonl()/ntohl() calls store the entry count in network (big-endian) byte order, which covers host machines of different endianness (a big-endian machine writing the file, a little-endian machine reading it); the name and data blocks are plain char arrays and need no conversion. That's probably OK for your needs, but you should at least be aware of it.
Note 2: Pass your input disk parameter to load_disk() by reference:
void load_disk(disk& aDisk)
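For reference, a fuller sketch with basic error checking could look like this. It is written against generic streams so it can be tried in memory, the values of MAX_FILE_NAME and BLOCK_SIZE are assumptions, and the entry count is encoded byte by byte in big-endian order as a stand-in for htonl()/ntohl() so that no platform-specific header is needed.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <istream>
#include <ostream>
#include <sstream>
#include <vector>

const int MAX_FILE_NAME = 32;   // assumed value, not given in the question
const int BLOCK_SIZE = 128;     // assumed value, not given in the question

struct file_node
{
    char name[MAX_FILE_NAME];
    char data[BLOCK_SIZE - MAX_FILE_NAME];
    file_node() : name(), data() {}
};

struct disk
{
    std::vector<file_node> current_file;
};

// Writes a 4-byte big-endian entry count followed by the fixed-size blocks.
bool save_disk(std::ostream& out, const disk& aDisk)
{
    assert(aDisk.current_file.size() <= 0xFFFFFFFFu);   // count must fit in 32 bits
    const std::uint32_t entries = static_cast<std::uint32_t>(aDisk.current_file.size());
    const unsigned char count[4] = {
        static_cast<unsigned char>(entries >> 24), static_cast<unsigned char>(entries >> 16),
        static_cast<unsigned char>(entries >> 8), static_cast<unsigned char>(entries)};
    out.write(reinterpret_cast<const char*>(count), sizeof(count));
    for (const file_node& f : aDisk.current_file)
    {
        out.write(f.name, sizeof(f.name));
        out.write(f.data, sizeof(f.data));
    }
    return static_cast<bool>(out);
}

// Reads the count, then one block at a time; returns false on a short read.
bool load_disk(std::istream& in, disk& aDisk)
{
    unsigned char count[4];
    if (!in.read(reinterpret_cast<char*>(count), sizeof(count)))
        return false;
    const std::uint32_t entries =
        (std::uint32_t(count[0]) << 24) | (std::uint32_t(count[1]) << 16) |
        (std::uint32_t(count[2]) << 8) | std::uint32_t(count[3]);
    aDisk.current_file.clear();
    for (std::uint32_t i = 0; i < entries; ++i)
    {
        file_node f;
        if (!in.read(f.name, sizeof(f.name)) || !in.read(f.data, sizeof(f.data)))
            return false;
        aDisk.current_file.push_back(f);
    }
    return true;
}

// Usage sketch: save two entries to an in-memory stream and load them back.
bool demo_disk_round_trip()
{
    disk original;
    original.current_file.resize(2);
    std::strcpy(original.current_file[0].name, "a.txt");
    std::strcpy(original.current_file[1].data, "hello");
    std::stringstream ss(std::ios::in | std::ios::out | std::ios::binary);
    disk loaded;
    return save_disk(ss, original) && load_disk(ss, loaded) &&
           loaded.current_file.size() == 2 &&
           std::strcmp(loaded.current_file[0].name, "a.txt") == 0 &&
           std::strcmp(loaded.current_file[1].data, "hello") == 0;
}
```

Loading one block at a time, instead of resizing the vector up front, keeps a corrupted count from triggering a huge allocation before the short read is detected.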
EDIT Cleaning file_node content on construction
struct file_node
{
char name[MAX_FILE_NAME];
char data[BLOCK_SIZE - MAX_FILE_NAME];
file_node()
{
memset(name, 0, sizeof(name));
memset(data, 0, sizeof(data));
}
};
If you are using a compliant C++11 compiler:
struct file_node
{
char name[MAX_FILE_NAME];
char data[BLOCK_SIZE - MAX_FILE_NAME];
file_node() : name(), data() {}
};
I'm currently working on a small C++ project where I use a client-server model someone else built. Data gets sent over the network and in my opinion it's in the wrong order. However, that's not something I can change.
Example data stream (simplified):
0x20 0x00 (C++: short with value 32)
0x10 0x35 (C++: short with value 13584)
0x61 0x62 0x63 0x00 (char*: abc)
0x01 (bool: true)
0x00 (bool: false)
I can represent this specific stream as :
struct test {
short sh1;
short sh2;
char abc[4];
bool bool1;
bool bool2;
};
And I can typecast it with test *t = (test*)stream; However, the char* has a variable length. It is, however, always null terminated.
I understand that there's no way of actually casting the stream to a struct, but I was wondering whether there would be a better way than struct test() { test(char* data) { ... }} (convert it via the constructor)
This is called Marshalling or serialization.
What you must do is read the stream one byte at a time (or put all in a buffer and read from that), and as soon as you have enough data for a member in the structure you fill it in.
When it comes to the string, you simply read until you hit the terminating zero, and then allocate memory and copy the string to that buffer and assign it to a pointer in the struct.
Reading strings this way is simplest and most effective if you have all of the message in a buffer already, because then you don't need a temporary buffer for the string.
Remember though, that with this scheme you have to manually free the memory containing the string when you are done with the structure.
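The approach described above can be sketched as follows for a message that is already in a buffer. The function name is hypothetical, and the sketch assumes 2-byte shorts sent least significant byte first, as in the example stream from the question.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct test
{
    short sh1;
    short sh2;
    char* abc;     // heap copy of the string; release with delete[]
    bool bool1;
    bool bool2;
};

// Parses one message from `buf` (`len` bytes); returns false if it is truncated.
bool parse_test(const unsigned char* buf, std::size_t len, test& out)
{
    assert(buf != nullptr);
    if (len < 4)
        return false;
    out.sh1 = static_cast<short>(buf[0] | (buf[1] << 8));
    out.sh2 = static_cast<short>(buf[2] | (buf[3] << 8));

    std::size_t pos = 4;
    // Locate the terminating zero of the variable-length string.
    const void* end = std::memchr(buf + pos, 0, len - pos);
    if (end == nullptr)
        return false;
    const std::size_t strLen =
        static_cast<std::size_t>(static_cast<const unsigned char*>(end) - (buf + pos));
    if (len - pos - strLen - 1 < 2)           // two bools must follow the terminator
        return false;

    out.abc = new char[strLen + 1];
    std::memcpy(out.abc, buf + pos, strLen + 1);   // copy including the terminator
    pos += strLen + 1;
    out.bool1 = buf[pos] != 0;
    out.bool2 = buf[pos + 1] != 0;
    return true;
}

// Usage sketch with the example bytes from the question.
bool demo_parse_test()
{
    const unsigned char stream[] = {0x20, 0x00, 0x10, 0x35, 0x61, 0x62, 0x63, 0x00, 0x01, 0x00};
    test t;
    if (!parse_test(stream, sizeof(stream), t))
        return false;
    const bool ok = t.sh1 == 32 && t.sh2 == 13584 &&
                    std::strcmp(t.abc, "abc") == 0 && t.bool1 && !t.bool2;
    delete[] t.abc;                            // manual cleanup, as noted above
    return ok;
}
```

Because the string is copied straight out of the message buffer, no temporary buffer is required.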
Just add a member function that takes in the character buffer(function input parameter char *) and populates the test structure by parsing it.
This makes it more clear and readable as well.
If you provide an implicit conversion constructor then you create a menace which will do the conversion when you least expect it.
When reading variable length data from a sequence of bytes,
you shouldn't fit everything into a single structure or variable.
Pointers are typically used to refer to such variable-length data.
The following suggestion is not tested:
// data is stored in memory,
// in a different way,
// NOT as sequence of bytes,
// as provided
struct data {
short sh1;
short sh2;
int abclength;
// a pointer, maybe variable in memory !!!
char* abc;
bool bool1;
bool bool2;
};
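// "byte" is not a built-in type; a plain unsigned char is assumed here
typedef unsigned char byte;
// reads a single byte
bool readByte(byte* MyByteBuffer)
{
// your reading code goes here,
// character by character, from stream,
// file, pipe, whatever.
// The result should be true if there was no error,
// false if it cannot read anymore
return false; // placeholder until the real reading code is filled in
}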
// used for reading several variables,
// with different sizes in bytes
int readBuffer(byte* Buffer, int BufferSize)
{
int RealCount = 0;
byte* p = Buffer;
// check the remaining space first, so no byte is read past the end of the buffer
while (RealCount < BufferSize && readByte(p))
{
RealCount++;
p++;
}
return RealCount;
}
void read()
{
// real data here:
data MyData;
byte MyByte = 0;
// long enough, used to temporarily hold the variable-length string
char temp[64000];
// fill buffer for string with null values
memset(temp, '\0', 64000);
int RealCount = 0;
// try read "sh1" field
RealCount = readBuffer((byte*)&(MyData.sh1), sizeof(short));
if (RealCount == sizeof(short))
{
// try read "sh2" field
RealCount = readBuffer((byte*)&(MyData.sh2), sizeof(short));
if (RealCount == sizeof(short))
{
// read the string byte by byte, stopping after its null terminator
// so that the fields following it stay in the stream
RealCount = 0;
while (RealCount < 64000 && readByte(&MyByte))
{
temp[RealCount++] = (char)MyByte;
if (MyByte == 0)
break;
}
if (RealCount > 0)
{
// store real bytes count (terminator included)
MyData.abclength = RealCount;
// allocate dynamic memory block for variable length data
MyData.abc = (char*)malloc(RealCount);
// copy data from the temporary buffer into the block the pointer refers to;
// an array name already converts to a pointer, so no "&" operator is needed:
memcpy(MyData.abc, temp, RealCount);
// continue with rest of data
RealCount = readBuffer((byte*)&(MyData.bool1), sizeof(bool));
if (RealCount > 0)
{
// continue with rest of data
RealCount = readBuffer((byte*)&(MyData.bool2), sizeof(bool));
}
}
}
}
} // void read()
Cheers.
The pointer to the audio buffer of the XAUDIO2_BUFFER structure in XAudio2 is defined as BYTE *pAudioData. When I was using 16-bit Integer PCM, this is what my program looked like:
void buildWaveBuffer(std::vector<unsigned char> &vec)
{
std::string lineString;
int lineInt;
unsigned char lowByte, highByte;
std::ifstream myfile("sineInt16");
if (myfile.is_open())
{
while(myfile.good())
{
std::getline(myfile,lineString,',');
lineInt = atoi(lineString.c_str());
highByte = (lineInt >> 8) & 0x00FF;
lowByte = lineInt & 0x00FF;
vec.push_back(lowByte);
vec.push_back(highByte);
}
myfile.close();
}
}
"sineInt16" being a .csv file. Since the vector is organized sequentially in the memory, I would simply do pAudioData = &vec[0] and it would work. What if I want to change the format of my .csv to float? How do I give a pointer to the first byte in the vector? Should I use another container like a simple array of chars?
How do I give a pointer to the first byte in the vector?
The exact same way, but I'm not sure it will do what you expect. Read the comments to your question.
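In code, "the exact same way" could look like the sketch below for a vector of floats. The helper names are hypothetical, and the sketch only shows how to obtain the byte pointer and byte count; whether the consumer interprets those bytes as float samples depends on the format description given to it, which is not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Returns a byte view of the vector's contiguous float storage.
const unsigned char* audio_bytes(const std::vector<float>& samples)
{
    assert(!samples.empty());   // an empty vector has no first byte to point at
    return reinterpret_cast<const unsigned char*>(samples.data());
}

// Size of that storage in bytes (not in samples).
std::size_t audio_byte_count(const std::vector<float>& samples)
{
    return samples.size() * sizeof(float);
}

// Usage sketch: the byte view covers exactly the floats stored in the vector.
bool demo_audio_bytes()
{
    const std::vector<float> samples = {0.5f, -0.5f};
    float copy[2] = {0.0f, 0.0f};
    std::memcpy(copy, audio_bytes(samples), audio_byte_count(samples));
    return audio_byte_count(samples) == 2 * sizeof(float) &&
           copy[0] == 0.5f && copy[1] == -0.5f;
}
```

The pointer stays valid only as long as the vector is neither destroyed nor resized, so the vector has to outlive any use of the buffer.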