Does anyone know how to read in a file with raw encoding? So stumped.... I am trying to read in floats or doubles (I think). I have been stuck on this for a few weeks. Thank you!
File that I am trying to read from:
http://www.sci.utah.edu/~gk/DTI-data/gk2/gk2-rcc-mask.raw
Description of raw encoding:
http://teem.sourceforge.net/nrrd/format.html#encoding
- "raw" - The data appears on disk exactly the same as in memory, in terms of byte values and byte ordering. Produced by write() and fwrite(), suitable for read() or fread().
Info of file:
http://www.sci.utah.edu/~gk/DTI-data/gk2/gk2-rcc-mask.nhdr - I think the only things that matter here are the big endian (still trying to understand what that means from google) and raw encoding.
My current approach, uncertain if it's correct:
//Function ripped off from example of c++ ifstream::read reference page
void scantensor(string filename){
ifstream tdata(filename, ifstream::binary); // not sure if I should put ifstream::binary here
// other things I tried
// ifstream tdata(filename) ifstream tdata(filename, ios::in)
if(tdata){
tdata.seekg(0, tdata.end);
int length = tdata.tellg();
tdata.seekg(0, tdata.beg);
char* buffer = new char[length];
tdata.read(buffer, length);
tdata.close();
double* d;
d = (double*) buffer;
} else cerr << "failed" << endl;
}
/* P.S. I attempted to print the first 100 elements of the array.
Then I print 100 other elements at some arbitrary array indices (i.e. 9,900 - 10,000). I actually kept increasing the number of 0's until I ran out of bound at 100,000,000 (I don't think that's how it works lol but I was just playing around to see what happens)
Here's the part that makes me suspicious: ifstream has different constructors, like the ones I tried above.
the first 100 values are always the same.
if I use ifstream::binary, then I get some values for the 100 arbitrary printing
if I use the other two options, then I get -6.27744e+066 for all 100 of them
So for now I am going to assume that ifstream::binary is the correct one. The thing is, I am not sure if the file I provided is how binary files actually look like. I am also unsure if these are the actual numbers that I am supposed to read in or just casting gone wrong. I do realize that my casting from char* to double* can be unsafe, and I got that from one of the threads.
*/
I really appreciate it!
Edit 1: Right now the data being read in using the above method is apparently "incorrect" since in paraview the values are:
Dxx,Dxy,Dxz,Dyy,Dyz,Dzz
[0, 1], [-15.4006, 13.2248], [-5.32436, 5.39517], [-5.32915, 5.96026], [-17.87, 19.0954], [-6.02961, 5.24771], [-13.9861, 14.0524]
It's a 3 x 3 symmetric matrix, so 6 distinct values; together with the first [0, 1] range (presumably the mask) that makes 7 values per sample and 7 ranges.
The floats that I am currently parsing from the file are nowhere near these ranges (e.g. -4.68855e-229, -1.32351e+120).
Perhaps somebody knows how to extract the floats from Paraview?
Since you want to work with doubles, I recommend reading the data from the file into a buffer of doubles:
const long machineMemory = 0x40000000; // 1 GB
FILE* file = fopen("c:\\data.bin", "rb");
if (file)
{
size_t size = machineMemory / sizeof(double);
if (size > 0)
{
double* data = new double[size];
size_t read = 0; // fread returns a size_t count of complete items
while ((read = fread(data, sizeof(double), size, file)) > 0)
{
// Process data here (read = number of doubles)
}
delete [] data;
}
fclose(file);
}
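The loop above reads the bytes in the host's byte order, which does not cover the "big endian" note from the question's header file. Here is a minimal sketch of decoding big-endian values by hand. It assumes the samples are 4-byte IEEE-754 floats (worth checking against the type field in the .nhdr) and that the host uses the same byte order for float and 32-bit integers; decodeBigEndianFloat and readBigEndianFloats are made-up names, not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Decode one big-endian float from 4 raw bytes, independent of host byte order.
float decodeBigEndianFloat(const unsigned char* bytes)
{
    static_assert(sizeof(float) == 4, "this sketch assumes a 4-byte float");
    assert(bytes != nullptr);
    // Assemble the bit pattern most significant byte first.
    const std::uint32_t bits = (std::uint32_t(bytes[0]) << 24) |
                               (std::uint32_t(bytes[1]) << 16) |
                               (std::uint32_t(bytes[2]) << 8)  |
                                std::uint32_t(bytes[3]);
    float value;
    std::memcpy(&value, &bits, sizeof value); // reinterpret the bits without pointer casts
    return value;
}

// Read a whole raw file and decode it as big-endian floats.
// Returns an empty vector if the file cannot be opened.
std::vector<float> readBigEndianFloats(const std::string& filename)
{
    std::ifstream in(filename, std::ios::binary);
    if (!in)
        return {};
    const std::vector<unsigned char> raw((std::istreambuf_iterator<char>(in)),
                                         std::istreambuf_iterator<char>());
    std::vector<float> values(raw.size() / 4); // trailing bytes that do not fill a float are ignored
    for (std::size_t i = 0; i < values.size(); ++i)
        values[i] = decodeBigEndianFloat(&raw[4 * i]);
    return values;
}
```

If the decoded values fall inside the ranges ParaView reports, the type and byte-order assumptions are probably right; if not, the header is the place to look.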
Related
I have 640*480 numbers. I need to write them into a file. I will need to read them later. What is the best solution? Numbers are between 0 - 255.
For me the best solution is to write them binary(8 bits). I wrote the numbers into txt file and now it looks like 1011111010111110 ..... So there are no questions where the number starts and ends.
How am I supposed to read them from the file?
Using c++
It's not a good idea to write bit values as the characters 1 and 0 in a text file. The file will be 8 times bigger than necessary: every character in a text file takes at least 1 byte, and 1 byte = 8 bits, so each number costs 8 bytes instead of 1. Values from 0 to 255 fit exactly in one byte, so store them as bytes. Your file will then be 640*480 bytes instead of 640*480*8. If you need the individual bits later, use the bitwise operators of your programming language. Reading bytes is much easier, so use a binary file for saving your data.
Presumably you have some sort of data structure representing your image, which somewhere inside holds the actual data:
class pixmap
{
public:
// stuff...
private:
std::unique_ptr<std::uint8_t[]> data;
};
So you can add a new constructor which takes a filename and reads bytes from that file:
pixmap(const std::string& filename)
{
constexpr int SIZE = 640 * 480;
// Open an input file stream and set it to throw exceptions:
std::ifstream file;
file.exceptions(std::ios_base::badbit | std::ios_base::failbit);
file.open(filename.c_str(), std::ios::binary); // binary mode, so no newline translation
// Create a unique ptr to hold the data: this will be cleaned up
// automatically if file reading throws
std::unique_ptr<std::uint8_t[]> temp(new std::uint8_t[SIZE]);
// Read SIZE bytes from the file
file.read(reinterpret_cast<char*>(temp.get()), SIZE);
// If we get to here, the read worked, so we move the temp data we've just read
// into where we'd like it
data = std::move(temp); // or std::swap(data, temp) if you prefer
}
I realise I've assumed some implementation details here (you might not be using a std::unique_ptr to store the underlying image data, though you probably should be) but hopefully this is enough to get you started.
You can write each number between 0 and 255 to the file as a single char.
See the code below. In this example I print the integer 70 as a char,
which shows up as 'F' on the console.
Similarly, you can read a char back and convert it to an integer.
#include <stdio.h>
int main()
{
int i = 70;
char dig = (char)i;
printf("%c", dig);
return 0;
}
This way each number takes exactly one byte in the file.
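To make the one-byte-per-number idea concrete, here is a rough sketch that writes the values to a binary file and reads them back. The helper names writeBytes and readBytes are invented for this example, and it assumes the values are already held in a std::vector<std::uint8_t>.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Write every value as one raw byte. Returns false on I/O failure.
bool writeBytes(const std::string& filename, const std::vector<std::uint8_t>& values)
{
    assert(!filename.empty());
    std::ofstream out(filename, std::ios::binary);
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size()));
    return static_cast<bool>(out);
}

// Read exactly `count` bytes back.
// Returns an empty vector if the file is missing or shorter than `count`.
std::vector<std::uint8_t> readBytes(const std::string& filename, std::size_t count)
{
    std::vector<std::uint8_t> values(count);
    std::ifstream in(filename, std::ios::binary);
    if (!in.read(reinterpret_cast<char*>(values.data()),
                 static_cast<std::streamsize>(count)))
        values.clear();
    return values;
}
```

For the image in the question, count would be 640 * 480, and the resulting file is exactly that many bytes long.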
I am taking input from a file in binary mode using C++; I read the data into unsigned ints, process them, and write them to another file. The problem is that sometimes, at the end of the file, there might be a little bit of data left that isn't large enough to fit into an int; in this case, I want to pad the end of the file with 0s and record how much padding was needed, until the data is large enough to fill an unsigned int.
Here is how I am reading from the file:
std::ifstream fin;
fin.open("filename.whatever", std::ios::in | std::ios::binary);
if(fin) {
unsigned int m;
while(fin >> m) {
//processing the data and writing to another file here
}
//TODO: read the remaining data and pad it here prior to processing
} else {
//output to error stream and exit with failure condition
}
The TODO in the code is where I'm having trouble. After the file input finishes and the loop exits, I need to read in the remaining data at the end of the file that was too small to fill an unsigned int. I need to then pad the end of that data with 0's in binary, recording enough about how much padding was done to be able to un-pad the data in the future.
How is this done, and is this already done automatically by C++?
NOTE: I cannot read the data into anything but an unsigned int, as I am processing the data as if it were an unsigned integer for encryption purposes.
EDIT: It was suggested that I simply read what remains into an array of chars. Am I correct in assuming that this will read in ALL remaining data from the file? It is important to note that I want this to work on any file that C++ can open for input and/or output in binary mode. Thanks for pointing out that I failed to include the detail of opening the file in binary mode.
EDIT: The files my code operates on are not created by anything I have written; they could be audio, video, or text. My goal is to make my code format-agnostic, so I can make no assumptions about the amount of data within a file.
EDIT: ok, so based on constructive comments, this is something of the approach I am seeing, documented in comments where the operations would take place:
std::ifstream fin;
fin.open("filename.whatever", std::ios::in | std::ios::binary);
if(fin) {
unsigned int m;
while(fin >> m) {
//processing the data and writing to another file here
}
//1: declare Char array
//2: fill it with what remains in the file
//3: fill the rest of it until it's the same size as an unsigned int
} else {
//output to error stream and exit with failure condition
}
The question, at this point, is this: is this truly format-agnostic? In other words, are bytes used to measure file size as discrete units, or can a file be, say, 11.25 bytes in size? I should know this, I know, but I've got to ask it anyway.
Are bytes used to measure file size as discrete units, or can a file be, say, 11.25 bytes in size?
Files are measured in whole bytes. No data type can be smaller than a byte, and your file is read as a sequence of char, each of which is one byte. A file therefore cannot be 11.25 bytes in size; its length is always a whole number of bytes.
Here is step one, two, and three as per your post:
while (fin >> m)
{
// ...
}
std::ostringstream buffer;
buffer << fin.rdbuf();
std::string contents = buffer.str();
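// pad with zero bytes up to a multiple of sizeof(unsigned int)
const std::size_t padding = (sizeof(unsigned int) - contents.size() % sizeof(unsigned int)) % sizeof(unsigned int);
contents.append(padding, '\0'); // keep `padding` so the data can be un-padded later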
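One thing worth noting: fin >> m is formatted extraction, so it parses decimal text rather than raw bytes, even on a stream opened in binary mode. A possible alternative, sketched under the assumption that the file should be consumed in raw sizeof(unsigned int) chunks, is to use read() and gcount() and zero-pad only the final partial chunk. PaddedWords and readPaddedWords are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

struct PaddedWords
{
    std::vector<unsigned int> words; // file contents; the last word is zero-padded if needed
    std::size_t padding = 0;         // number of zero bytes appended to the last word
};

PaddedWords readPaddedWords(const std::string& filename)
{
    PaddedWords result;
    std::ifstream fin(filename, std::ios::binary);
    char chunk[sizeof(unsigned int)];
    while (fin)
    {
        std::memset(chunk, 0, sizeof chunk);   // pre-fill so a short read is already padded
        fin.read(chunk, sizeof chunk);         // unformatted read of raw bytes
        const std::size_t got = static_cast<std::size_t>(fin.gcount());
        if (got == 0)
            break;                             // nothing left in the file
        unsigned int word;
        std::memcpy(&word, chunk, sizeof word);
        result.words.push_back(word);
        result.padding = sizeof chunk - got;   // non-zero only for a final partial chunk
    }
    assert(result.padding < sizeof(unsigned int));
    return result;
}
```

Storing result.padding alongside the processed output (for example in a small header) is what makes un-padding possible later; where to store it is a design decision this sketch leaves open.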
I'm writing a resource file which I want to insert a bunch of data from various common files such as .JPG, .BMP (for example) and I want it to be in binary.
I'm going to code something to retrieve these data later on organized by index, and this is what I got so far:
float randomValue = 23.14f;
ofstream fileWriter;
fileWriter.open("myFile.dat", ios::binary);
fileWriter.write((char*)&randomValue, sizeof(randomValue));
fileWriter.close();
//With this my .dat file, when opened in notepad has "B!¹A" in it
float retrieveValue = 0.0f;
ifstream fileReader;
fileReader.open("myFile.dat", ios::binary);
fileReader.read((char*)&retrieveValue, sizeof(retrieveValue));
fileReader.close();
cout << retrieveValue << endl; //This gives me exactly the 23.14 I wanted, perfect!
While this works nicely, I'd like to understand what exactly is happening there.
I'm converting the address of randomValue to char*, and writing the values in this address to the file?
I'm curious also because I need to do this for an array, and I can't do this:
int* myArray = new int[10];
//fill myArray values with random stuff
fileWriter.open("myFile.dat", ios::binary);
fileWriter.write((char*)&myArray, sizeof(myArray));
fileWriter.close();
From what I understand, this would just write the first address' value in the file, not all the array. So, for testing, I'm trying to simply convert a variable to a char* which I would write to a file, and convert back to the variable to see if I'm retrieving the values correctly, so I'm with this:
int* intArray = new int[10];
for(int i = 0; i < 10; i++)
{
cout << &intArray[i]; //the address of each number in my array
cout << intArray[i]; //its value
cout << reinterpret_cast<char*>(&intArray[i]); //the char* value of each one
}
But for some reason I don't know, my computer "beeps" when I run this code. During the array, I'm also saving these to a char* and trying to convert back to int, but I'm not getting the results expected, I'm getting some really long values.
Something like:
float randomValue = 23.14f;
char* charValue = reinterpret_cast<char*>(&randomValue);
//charValue contains "B!¹A" plus a bunch of other (uninitialized?) characters, so I'm guessing the value is correct
//Now I'm here
I want to convert charValue back to randomValue, how can I do it?
edit: There's valuable information in the answers below, but they don't solve my (original) problem. I was testing these type of conversions because I'm doing a code that I will pick a bunch of resource files such as BMP, JPG, MP3, and save them in a single .DAT file organized by some criteria I still haven't fully figured out.
Later, I am going to use this resource file to read from and load these contents into a program (game) I'm coding.
The criteria I am still thinking but I was wondering if it's possible to do something like this:
//In my ResourceFile.DAT
[4 bytes = objectID][3 bytes = objectType (WAV, MP3, JPG, BMP, etc)][4 bytes = objectLength][objectLength bytes = actual objectData]
//repeating this until end of file
And then in the code that reads the resource file, I want to do something like this (untested):
ifstream fileReader;
fileReader.open("myFile.DAT", ios::binary);
//file check stuff
int objectID = 0;
while(fileReader.read((char*)&objectID, 4)) //read 4 bytes to fill objectID; the loop ends when nothing more can be read
{
char objectType[3];
fileReader.read(objectType, 3); //read the type so I know which parser use
int objectLength = 0;
fileReader.read((char*)&objectLength, 4); //get the length of the object data
char* objectData = new char[objectLength];
fileReader.read(objectData, objectLength); //fill objectData with the data
//Here I'll use a parser to fill classes depending on the type etc, and move on to the next obj
delete[] objectData; //release the buffer once the parser is done with it
}
Currently my code is working with the original files (BMP, WAV, etc) and filling them into classes, and I want to know how I can save the data from these files into a binary data file.
For example, my class that manages BMP data has this:
class FileBMP
{
public:
int imageWidth;
int imageHeight;
int* imageData;
void Load(int iwidth, int iheight);
};
When I load it, I call:
void FileBMP::Load(int iwidth, int iheight)
{
int imageTotalSize = iwidth * iheight * 4;
imageData = new int[imageTotalSize]; //This will give me 4 times the amount of pixels in the image
int cPixel = 0;
while(cPixel < imageTotalSize)
{
imageData[cPixel] = 0; //R value
imageData[cPixel + 1] = 0; //G value
imageData[cPixel + 2] = 0; //B value
imageData[cPixel + 3] = 0; //A value
cPixel += 4;
}
}
So I have this single dimension array containing values in the format of [RGBA] per pixel, which I am using later on for drawing on screen.
I want to be able to save just this array in the binary data format that I am planning that I stated above, and then read it and fill this array.
I think it's asking too much for a code like this, so I'd like to understand what I need to know to save these values into a binary file and then read back to fill it.
Sorry for the long post!
edit2: I solved my problem by making the first edit... thanks for the valuable info, I also got to know what I wanted to!
By using the & operator, you're getting a pointer to the contents of the variable (think of it as just a memory address).
float a = 123.45f;
float* p = &a; // now p points to a, i.e. has the memory address to a's contents.
char* c = (char*)&a; // c points to the same memory location, but the code says to treat the contents as char instead of float.
When you gave the (char*)&randomValue for write(), you simply told "take this memory address having char data and write sizeof(randomValue) chars from there". You're not writing the address value itself, but the contents from that location of memory ("raw binary data").
cout << reinterpret_cast<char*>(&intArray[i]); //the char* value of each one
Here cout expects a char* that points to a string terminated with a null char (zero). However, you're giving it the raw bytes of an int instead. Your program might crash here, as cout will keep outputting chars until it finds the terminator char, which it might not find anytime soon.
float randomValue = 23.14f;
char* charValue = reinterpret_cast<char*>(&randomValue);
float back = *(float*)charValue;
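The cast back works here because charValue still points at a real float object. When the bytes come from somewhere else, such as a buffer filled from a file, copying with std::memcpy sidesteps alignment and aliasing concerns. A small sketch, with floatToBytes and bytesToFloat as invented helper names:

```cpp
#include <cassert>
#include <cstring>

// Copy the bytes of a float into a caller-supplied buffer of at least sizeof(float) bytes.
void floatToBytes(float value, char* bytes)
{
    assert(bytes != nullptr);
    std::memcpy(bytes, &value, sizeof value);
}

// Rebuild a float from sizeof(float) raw bytes.
float bytesToFloat(const char* bytes)
{
    assert(bytes != nullptr);
    float value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}
```

Usage would look like: char buf[sizeof(float)]; floatToBytes(23.14f, buf); float back = bytesToFloat(buf);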
Edit: to save binary data, you simply need to provide the data and write() it. Do not use << operator overloads with ofstream/cout. For example:
int values[3] = { 5, 6, 7 };
struct AnyData
{
float a;
int b;
} data;
cout.write((char*)&values, sizeof(int) * 3); // the other two values follow the first one, you can write them all at once.
cout.write((char*)&data, sizeof(data)); // you can also save structs that do not have pointers.
In case you're going to write structs, have a look at the #pragma pack compiler directive. Compilers may insert padding between members to align them (typically to the size of an int), which means that without packing the following struct might actually require 8 bytes instead of 5:
#pragma pack (push, 1)
struct CouldBeLongerThanYouThink
{
char a;
int b;
};
#pragma pack (pop)
Also, do not write pointer values themselves (if there are pointer members in a struct), because the memory addresses will not point to any meaningful data once read back from a file. Always write the data itself, not pointer values.
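For the dynamically allocated array from the question, "write the data itself" could look roughly like the sketch below: store the element count first, then the elements. The function names saveInts and loadInts and the 32-bit count prefix are choices made for this example, and the file is only portable between programs that agree on sizeof(int) and byte order.

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Write a 32-bit element count followed by the raw int values.
bool saveInts(const std::string& filename, const std::vector<int>& values)
{
    assert(values.size() <= 0xFFFFFFFFu); // the count must fit in the 32-bit prefix
    std::ofstream out(filename, std::ios::binary);
    const std::uint32_t count = static_cast<std::uint32_t>(values.size());
    out.write(reinterpret_cast<const char*>(&count), sizeof count);
    // values.data() points at the elements, so the elements are written, not an address.
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(int)));
    return static_cast<bool>(out);
}

// Read the count, then that many ints. Returns an empty vector on any failure.
std::vector<int> loadInts(const std::string& filename)
{
    std::ifstream in(filename, std::ios::binary);
    std::uint32_t count = 0;
    if (!in.read(reinterpret_cast<char*>(&count), sizeof count))
        return {};
    std::vector<int> values(count);
    if (!in.read(reinterpret_cast<char*>(values.data()),
                 static_cast<std::streamsize>(values.size() * sizeof(int))))
        values.clear();
    return values;
}
```

With a raw int* from new int[10], the equivalent write call would pass the pointer itself and 10 * sizeof(int), not the address of the pointer and sizeof of the pointer.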
What's happening is that you're copying the internal
representation of your data to a file, and then copying it back
into memory. This works as long as the program doing the
writing was compiled with the same version of the compiler,
using the same options. Otherwise, it might or it might not
work, depending on any number of things beyond your control.
It's not clear to me what you're trying to do, but formats like
.jpg and .bmp normally specify the format they want the
different types to have, and you have to respect that format.
It is unclear what you really want to do, so I cannot recommend a way of solving your real problem. But I would not be surprised if running the program actually caused beeps or any other strange behavior in your program.
int* intArray = new int[10];
for(int i = 0; i < 10; i++)
{
cout << reinterpret_cast<char*>(&intArray[i]);
}
The memory returned by new above is uninitialized, but you are trying to print it as if it were a null-terminated string. That uninitialized memory could contain the bell character (which causes a beep when printed to the terminal) or any other value. It might also contain no null terminator at all, in which case the stream insertion operator will run past the end of the buffer until it either finds a null or your program crashes accessing invalid memory.
There are other incorrect assumptions in your code, like for example given int *p = new int[10]; the expression sizeof(p) will be the size of a pointer in your architecture, not 10 times the size of an integer.
I've tried everything I can think of and can't get anything to work. I have a binary file I've written in VB.Net which basically consists of an integer, (in binary of course) that tells me the array size for the following data, then the floats as binary data. The file writes just fine from VB.Net, and I can read it back in through Visual C++ just fine using the following code:
ifstream output("c:\\out.ipv", ios::in | ios::binary);
UInt32 len;
UInt32 *ptr2 = (UInt32*)&len;
output.read((char*)ptr2, 4);
This returns the correct value of 456780, bytes are: 76, 248, 6, 0. When I run the exact same code on my iPad, I get 1043089572. If I use the alternate method below:
NSData *data = [[NSData alloc] initWithContentsOfFile:filePath];
UInt32 num;
const NSRange numV = {0, 4};
[data getBytes:&num range:numV];
This code returns a different value, 124724, and I'm not sure how to read what the exact bytes are that are getting pulled from the file. That's something else I was trying to figure out but couldn't get working. Any idea why the same method that works in Visual C++ won't work on the iPad? I'm really at a loss on this one.
This sounds like an endian issue. You can use any of the functions in <libkern/OSByteOrder.h> to read data in a specified endianness. In your case, you may want to do something like
NSInputStream *istream = [NSInputStream inputStreamWithFileAtPath:filePath];
UInt32 num = 0;
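if (istream) {
[istream open]; // an NSInputStream must be opened before it can be read
uint8_t buffer[4];
if ([istream read:buffer maxLength:4] == 4) {
num = OSReadLittleInt32(buffer, 0);
} else {
// there weren't 4 bytes in the file, or it could not be read
}
[istream close];
} else {
// the stream could not be created
}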
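Since the question also reads the file through ifstream, the same idea can be written in portable C++ by assembling the value from individual bytes, which gives the same result on any host byte order. This is only a sketch, and readLittleEndian32 is a made-up helper name:

```cpp
#include <cassert>
#include <cstdint>

// Build a 32-bit value from 4 bytes stored least significant byte first.
std::uint32_t readLittleEndian32(const unsigned char* bytes)
{
    assert(bytes != nullptr);
    return  std::uint32_t(bytes[0])        |
           (std::uint32_t(bytes[1]) << 8)  |
           (std::uint32_t(bytes[2]) << 16) |
           (std::uint32_t(bytes[3]) << 24);
}
```

Fed the bytes 76, 248, 6, 0 quoted in the question, this yields 76 + 248*256 + 6*65536 = 456780.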
OK, something really strange is going on with my data. I just looked at the raw byte values in both Visual c++ and objective-c, and they don't agree at all. I'm only reading the first four bytes of the file and looking at their values. I'm assuming at this point that I'm not reading them in correctly, but I don't know what I'm missing here. The Visual C++ code I'm using to look at the byte values is below:
ifstream input("c:\\out.ipv", ios::in | ios::binary);
Byte tmp[4];
input.read((char*)&tmp[0], 4);
The values in the tmp array are:
76
248
6
0
If I do the same thing in objective-c:
ifstream input([filePath UTF8String], ios::in | ios::binary);
Byte tmp[4];
input.read((char*)&tmp[0], 4);
I get:
164
72
44
62
What gives? I would have at least expected to get the same byte values. The file containing the four bytes I am having trouble with is here: newout1.ipv
EDIT:
I realized where the 164,72,44,62 byte values are coming from: those are the initial values the Byte array has before I put anything in it. For some reason the line:
input.read((char*)&tmp[0], 4);
isn't doing anything. Any ideas why it's not reading from the file like it should?
FINAL EDIT:
OK, I probably shouldn't post the answer to this since it makes me look really dumb, but I don't want anyone reading these posts to get confused. So the arrays and objects were always returning the same values no matter what, which also happened to be whatever values they had when they were allocated. I had one too many .'s in my filename, so it was trying to read in out..ipv rather than out.ipv. Once I fixed the filename, everything worked exactly how I expected it to. Sorry for the confusion, and thanks for everyone's help.
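A failed open is silent with ifstream: read() then does nothing and the buffer keeps whatever it held before. One way to catch this early, sketched here with a hypothetical helper called readPrefix, is to check is_open() and report how many bytes were really read via gcount():

```cpp
#include <cassert>
#include <cstddef>
#include <fstream>
#include <string>

// Try to read `count` bytes into `buffer` and return how many were actually read.
// Returns 0 if the file could not be opened, so stale buffer contents are not mistaken for data.
std::size_t readPrefix(const std::string& path, char* buffer, std::size_t count)
{
    assert(buffer != nullptr);
    std::ifstream input(path, std::ios::in | std::ios::binary);
    if (!input.is_open())
        return 0; // for example a mistyped name such as "out..ipv"
    input.read(buffer, static_cast<std::streamsize>(count));
    return static_cast<std::size_t>(input.gcount());
}
```

A caller would compare the return value with the number of bytes requested and print the path on a mismatch, which makes a stray extra dot easy to spot.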
What is an efficient, proper way of reading in a data file with mixed characters? For example, I have a data file that contains a mixture of data loaded from other files, 32-bit integers, characters and strings. Currently, I am using an fstream object, but it gets stopped once it hits an int32 or the end of a string. If I add random data onto the end of the string in the data file, it seems to follow through with the rest of the file. This leads me to believe that the null-termination added onto strings is messing it up. Here's an example of loading in the file:
void main()
{
fstream fin("C://mark.dat", ios::in|ios::binary|ios::ate);
char *mymemory = 0;
int size;
size = 0;
if (fin.is_open())
{
size = static_cast<int>(fin.tellg());
mymemory = new char[static_cast<int>(size+1)];
memset(mymemory, 0, static_cast<int>(size + 1));
fin.seekg(0, ios::beg);
fin.read(mymemory, size);
fin.close();
printf(mymemory);
std::string hithere;
hithere = cin.get();
}
}
Why might this code stop after reading in an integer or a string? How might one get around this? Is this the wrong approach when dealing with these types of files? Should I be using fstream at all?
Have you ever considered that the file reading is working perfectly and it is printf(mymemory) that is stopping at the first null?
Have a look with the debugger and see if I am right.
Also, if you want to print someone else's buffer, use puts(mymemory) or printf("%s", mymemory) (both still stop at the first null byte). Don't accept someone else's input as the format string; it could crash your program.
Try
for (int i = 0; i < size ; ++i)
{
// 0 - pad with 0s
// 2 - to two digits
// X - a Hex value with capital A-F (0A, 1B, etc)
printf("%02X ", (unsigned char)mymemory[i]); // unsigned so bytes >= 0x80 are not sign-extended
if ((i + 1) % 32 == 0)
printf("\n"); //New line every 32 bytes
}
as a way to dump your data file back out as hex.