I am reading in binary data from a file:
char* buffIn = new char[8];
ifstream inFile(path, ifstream::binary);
inFile.read(buffIn, 8);
I then want to convert the char* read in (as binary) to an unsigned long, but I am having problems. I am not quite sure what is going on, but, for instance, 0x00000000000ACD gets interpreted as 0xFFFFFFFFFFFFCD. I suspect the 0x00 bytes are causing some sort of problem when converting from char* to unsigned long...
unsigned long number = *(buffIn);
How do I do this properly?
Since buffIn is of type char pointer, when you do *(buffIn) you are just grabbing one character, and because plain char is signed on your platform that single byte gets sign-extended, which is where all the 0xFF bytes come from. You have to reinterpret the memory address as an unsigned long pointer and then dereference it.
unsigned long number = *((unsigned long*)buffIn);
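If alignment or strict aliasing is a concern, a memcpy into the target does the same job without the cast (a minimal sketch; like the cast, it assumes the file's byte order matches the host's):

#include <cstring>   // std::memcpy

unsigned long number = 0;
// Copy sizeof(number) bytes from the buffer; memcpy has no alignment
// or aliasing restrictions, unlike the pointer cast above.
std::memcpy(&number, buffIn, sizeof number);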
In addition to recasting the char[8] (which only reads the first unsigned long, typically 32 or 64 bits depending on the platform), you can also use some simple bit-wise operations; casting through unsigned char avoids sign extension:
unsigned long value = ((unsigned long)(unsigned char)buffIn[0] << 24) |
                      ((unsigned long)(unsigned char)buffIn[1] << 16) |
                      ((unsigned long)(unsigned char)buffIn[2] << 8) |
                      (unsigned long)(unsigned char)buffIn[3];
Try something like
unsigned long* buffInL = new unsigned long[2];
char* buffIn=(char*)buffInL;
ifstream inFile(path, ifstream::binary);
inFile.read(buffIn, 8);
Unlike other types, char* is allowed to alias.
I'm trying to read an unsigned long number from a binary file.
I'm doing it this way:
infile.open("file.bin", std::ios::in | std::ios::binary);
char* U=new char[sizeof(unsigned long)];
unsigned long out=0;
infile.read(U, sizeof(unsigned long));
out=static_cast<unsigned long>(*U);
delete[] U;
U=NULL;
infile.close();
but the result is not correct.
My data is 6A F2 6B 58 00 00 00 00, which should be read as 1483469418, but out is 106 in my code, which is just the first byte of the data.
What is the problem?
How should I correctly read an unsigned long from the file?
That is because you are casting a dereferenced value: *U is only a single char (106), not the full 4 (or 8) bytes.
You can read the data in without the intermediate buffer:
infile.read(reinterpret_cast<char*>(&out), sizeof out);
The difference is that here you are reinterpreting the pointer, not the value under it.
If you still want to use the buffer, it should be *reinterpret_cast<unsigned long*>(U); this also reinterprets the pointer first and then dereferences it. The key is to dereference a pointer of the proper type, because the type of the pointer determines how many bytes are used for the value.
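To make the difference concrete (a minimal sketch using the buffer from the question; the values assume a little-endian host, matching the asker's data):

// U points at the bytes 6A F2 6B 58 00 00 00 00
unsigned long a = static_cast<unsigned long>(*U);       // 106: dereferences one char, then casts
unsigned long b = *reinterpret_cast<unsigned long*>(U); // 1483469418: reads sizeof(unsigned long) bytes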
out = static_cast<unsigned long>(*U); should be out = *(unsigned long *)U;
It can be much simpler:
infile.open("file.bin", std::ios::in | std::ios::binary);
unsigned long out=0;
infile.read((char *)&out, sizeof(out));
infile.close();
Try out=*reinterpret_cast<unsigned long *>(U);
You need to know whether the file (not the program) is big-endian or little-endian. Then read the bytes with fgetc() and reconstitute the number:
unsigned long read32be(FILE *fp)
{
unsigned long ch0, ch1, ch2, ch3;
ch0 = fgetc(fp);
ch1 = fgetc(fp);
ch2 = fgetc(fp);
ch3 = fgetc(fp);
return (unsigned long) (ch0 << 24) | (ch1 << 16) | (ch2 << 8) | ch3;
}
Now it will work regardless of whether long is 32 bits or 64, big-endian or little-endian. If the file is little-endian, swap the order of the fgetc()s.
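A minimal usage sketch (the file name is a placeholder):

#include <stdio.h>

int main(void)
{
    FILE *fp = fopen("file.bin", "rb");   /* placeholder name; "rb" for binary */
    if (fp) {
        unsigned long value = read32be(fp);
        printf("%lu\n", value);
        fclose(fp);
    }
    return 0;
}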
Reading binary files portably is surprisingly tricky. I've put some code on GitHub:
https://github.com/MalcolmMcLean/ieee754
I need to read binary data into a buffer, but fstream's read function reads the data into a char buffer, so my question is:
How do I transfer/cast the binary data into an unsigned char buffer, and is that the best solution in this case?
Example
char data[54];
unsigned char uData[54];
fstream file(someFilename,ios::in | ios::binary);
file.read(data,54);
// How do I transfer the char data into the unsigned char buffer?
Just read it into unsigned char data in the first place:
unsigned char uData[54];
fstream file(someFilename,ios::in | ios::binary);
file.read((char*)uData, 54);
The cast is necessary but harmless.
You don't need to declare the extra array uData. The data array can simply be cast to unsigned char:
unsigned char* uData = reinterpret_cast<unsigned char*>(data);
When accessing uData you instruct the compiler to interpret the data differently; for example, data[3] == -1 means uData[3] == 255.
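A quick illustration of that reinterpretation (a minimal sketch; the -1 assumes char is signed, as it is on most desktop platforms):

#include <iostream>

int main() {
    char data[4] = {};
    data[3] = -1;                       // as in the example above
    unsigned char* uData = reinterpret_cast<unsigned char*>(data);
    std::cout << static_cast<int>(data[3]) << '\n';   // -1
    std::cout << static_cast<int>(uData[3]) << '\n';  // 255
    return 0;
}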
You could just use
std::copy(data, data + n, uData);
where n is the number of bytes actually read (file.gcount() after the file.read(data, 54) call). I think, specifically for char* and unsigned char*, you can also portably read straight into the unsigned buffer:
file.read(reinterpret_cast<char*>(uData), 54);
std::streamsize n = file.gcount();
I have a char array that holds the value 0x4010, and I want to get this value into an unsigned short variable.
I did this using atoi but I am getting 0 as the short value:
unsigned short cvtValue = (unsigned short) atoi(aclDta);
The character for 0x10 is DLE; I suspect it is because of this.
Decimal is 16400.
You don't need to convert the data with atoi, just cast it:
unsigned short cvtValue = *(unsigned short *)aclDta;
What you're asking doesn't make sense. 0x4010 in ASCII is '@' followed by a 'data link escape'.
atoi, strtol etc. are all about parsing ASCII strings containing numbers; '@' followed by DLE isn't a number.
What you really seem to want is to treat the 0x4010 bytes as a single short.
Here's a cheap way (casting through unsigned char avoids sign extension):
unsigned short cvtValue = 0;
cvtValue |= (unsigned short)(unsigned char)aclDta[0] << 8;
cvtValue |= (unsigned short)(unsigned char)aclDta[1];
antiduh's answer is more correct if you might ever port your application to platforms having different endianness.
const char *str = "01";
unsigned short val = *(unsigned short *)str;
On little-endian systems val == 0x3130. On big-endian systems val == 0x3031.
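A runnable version of that check (a minimal sketch; the printed value depends on the host's byte order):

#include <cstdio>
#include <cstring>

int main() {
    const char *str = "01";              // bytes 0x30, 0x31 in memory
    unsigned short val = 0;
    std::memcpy(&val, str, sizeof val);  // memcpy sidesteps the alignment issue of the raw cast
    std::printf("0x%04X\n", static_cast<unsigned>(val)); // 0x3130 little-endian, 0x3031 big-endian
    return 0;
}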
I have a binary file in big-endian format from which I am retrieving 2-byte and 4-byte integer data. The machine I'm running on is little-endian.
Does anyone have any suggestions or a best practice for pulling integer data from a known-format binary file and switching endianness on the fly? I'm not sure that my current solution is even correct:
int myInt;
ifstream dataFile(dataFileLocation, ios::in | ios::binary);
dataFile.seekg(99, ios::beg); //Pull data starting at byte 100;
//For 4-byte value:
char chunk[4];
dataFile.read(chunk, 4);
myInt = (int)(chunk[0] << 24 | chunk[1] << 16 | chunk[2] << 8 | chunk[3]);
//For 2-byte value:
char chunk[2];
dataFile.read(chunk, 2);
myInt = (int)(chunk[0] << 8 | chunk[1]);
This seems to work fine for 2-byte data but gives what I believe are incorrect values for 4-byte data. I've read about htonl(), but from what I've read it's not a smart way to go if you need flexibility.
Use unsigned integral types only and you'll be fine:
unsigned char buf[4];
infile.read(reinterpret_cast<char*>(buf), 4);
unsigned int b4 = ((unsigned int)buf[0] << 24) + ((unsigned int)buf[1] << 16) + ((unsigned int)buf[2] << 8) + buf[3];
unsigned int b2 = ((unsigned int)buf[0] << 8) + buf[1];
Shifting involves integral promotions, and plain char may be signed (that is implementation-defined), which can lead to unwanted sign extension. Basically you always want everything to be unsigned when manipulating bits.
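Putting that together with the question's code (a minimal sketch; the file name is a placeholder and the offset comes from the question):

#include <cstdint>
#include <fstream>
#include <iostream>

int main() {
    std::ifstream dataFile("data.bin", std::ios::in | std::ios::binary); // placeholder name
    dataFile.seekg(99, std::ios::beg);  // pull data starting at byte 100

    unsigned char buf[4];

    // 4-byte big-endian value
    dataFile.read(reinterpret_cast<char*>(buf), 4);
    std::uint32_t b4 = (std::uint32_t(buf[0]) << 24) | (std::uint32_t(buf[1]) << 16)
                     | (std::uint32_t(buf[2]) << 8)  |  buf[3];

    // 2-byte big-endian value
    dataFile.read(reinterpret_cast<char*>(buf), 2);
    std::uint32_t b2 = (std::uint32_t(buf[0]) << 8) | buf[1];

    std::cout << b4 << ' ' << b2 << '\n';
    return 0;
}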
I have a process that listens to a UDP multicast broadcast and reads in the data as an unsigned char*.
I have a specification that indicates fields within this unsigned char*.
Fields are defined in the specification with a type and size.
Types are: uInt32, uInt64, unsigned int, and single byte string.
For the single byte string I can merely access the offset of the field in the unsigned char* and cast to a char, such as:
char character = (char)(data[1]);
For a single-byte uInt32 field I've been doing the following, which also seems to work:
uint32_t integer = (uint32_t)(data[20]);
However, for multiple byte conversions I seem to be stuck.
How would I convert several bytes in a row (substring of data) to its corresponding datatype?
Also, is it safe to wrap data in a string (for use of substring functionality)? I am worried about losing information, since I'd have to cast unsigned char* to char*, like:
std::string wrapper((char*)(data),length); //Is this safe?
I tried something like this:
std::string wrapper((char*)(data),length); //Is this safe?
uint32_t integer = (uint32_t)(wrapper.substr(20,4).c_str()); //4 byte int
But it doesn't work.
Thoughts?
Update
I've tried the suggested bit shift:
void function(const unsigned char* data, size_t data_len)
{
//From specifiction: Field type: uInt32 Byte Length: 4
//All integer fields are big endian.
uint32_t integer = (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | (data[3]);
}
This sadly gives me garbage (the same number for every call, from a callback).
I think you should be very explicit, and not just do "clever" tricks with casts and pointers. Instead, write a function like this:
uint32_t read_uint32_t(unsigned char **data)
{
const unsigned char *get = *data;
*data += 4;
return ((uint32_t)get[0] << 24) | ((uint32_t)get[1] << 16) | ((uint32_t)get[2] << 8) | get[3];
}
This extracts a single uint32_t value from a buffer of unsigned char, and increases the buffer pointer to point at the next byte of data in the buffer.
This assumes big-endian data, you need to have a well-defined idea of the buffer's endian-mode in order to interpret it.
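Usage might look like this (a sketch; the packet layout and the buffer name are hypothetical):

unsigned char *p = packet;            // cursor into the received data
uint32_t first  = read_uint32_t(&p);  // consumes bytes 0..3
uint32_t second = read_uint32_t(&p);  // consumes bytes 4..7
// p now points at byte 8, ready for the next field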
Depends on the byte ordering of the protocol; for big-endian, or so-called network byte order, do:
uint32_t i = (uint32_t)data[0] << 24 | (uint32_t)data[1] << 16 | (uint32_t)data[2] << 8 | data[3];
Without commenting on whether it's a good idea or not, the reason it doesn't work for you is that the result of casting wrapper.substr(20,4).c_str() is a (uint32_t *), not a (uint32_t). So if you do:
uint32_t integer = *(uint32_t *)(wrapper.substr(20,4).c_str());
it should work. Note that the temporary string returned by substr only lives until the end of that statement, so the pointer must be dereferenced immediately, as above.
uint32_t integer = ntohl(*reinterpret_cast<const uint32_t*>(data + 20));
or (handles alignment issues):
uint32_t integer;
memcpy(&integer, data+20, sizeof integer);
integer = ntohl(integer);
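Wrapped up as a helper it might look like this (a minimal sketch; read_be32 is a made-up name, and ntohl comes from <arpa/inet.h> on POSIX systems, <winsock2.h> on Windows):

#include <arpa/inet.h>  // ntohl (POSIX; use <winsock2.h> on Windows)
#include <cstdint>
#include <cstring>

// Extract a big-endian uint32 from an arbitrary offset in the packet.
uint32_t read_be32(const unsigned char* data, std::size_t offset)
{
    uint32_t value;
    std::memcpy(&value, data + offset, sizeof value);  // safe for any alignment
    return ntohl(value);                               // network (big-endian) to host order
}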
The pointer way:
uint32_t n = *(uint32_t*)&data[20];
You will run into problems on different-endian architectures, though. The solution with bit shifts is better and more portable.
std::string wrapper((char*)(data),length); //Is this safe?
This should be safe since you specified the length of the data.
On the other hand if you did this:
std::string wrapper((char*)data);
The string length would be determined by wherever the first 0 byte occurs, and you would more than likely chop off some data.
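To see the difference (a minimal sketch with made-up bytes containing an embedded zero):

#include <iostream>
#include <string>

int main() {
    const unsigned char data[] = { 'a', 'b', 0x00, 'c', 'd' };
    std::size_t length = sizeof data;

    std::string withLength(reinterpret_cast<const char*>(data), length);
    std::string withoutLength(reinterpret_cast<const char*>(data));

    std::cout << withLength.size() << '\n';     // 5: all bytes kept
    std::cout << withoutLength.size() << '\n';  // 2: stops at the embedded 0
    return 0;
}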