Reading 6 byte 8-bit integer from binary file - c++

This is what my file looks like:
00 00 00 00 00 34 ....
I have already read it into an unsigned char array using fread, but I don't know how to turn it into an unsigned integer.
The array looks like this:
0, 0, 0, 0, 0, 52

This is how I got it to work:
unsigned char table_index[6];
fread(table_index, 1, 6, file);
unsigned long long tindex = 0;
tindex = (tindex << 8);
tindex = (tindex << 8);
tindex = (tindex << 8) + table_index[0];
tindex = (tindex << 8) + table_index[1];
tindex = (tindex << 8) + table_index[2];
tindex = (tindex << 8) + table_index[3];
tindex = (tindex << 8) + table_index[4];
tindex = (tindex << 8) + table_index[5];

You're starting with a 48 bit value but there's probably no 48 bit integer type on your system. There is probably a 64 bit type though, and it might be a "long long".
Assuming your 6 bytes are ordered most significant first, and understanding that you need to fill out two extra bytes for a long long, you might do something such as:
long long myNumber;
char *ptr = (char *)&myNumber;
*ptr++ = 0; // pad the msb
*ptr++ = 0; // pad the 2nd msb
fread(ptr, 1, 6, fp);
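Now you've got a value in myNumber. Note that this only gives the right value on a big-endian host; on a little-endian machine (such as x86) the bytes end up in the wrong order.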
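The pointer approach above depends on the byte order of the host. A minimal sketch of an alternative that does not, assuming the file stores the most significant byte first (`read_big_endian` is a hypothetical helper name, not something from the question):

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Combine `count` bytes, most significant first, into one unsigned value.
// Hypothetical helper; the result is the same on little- and big-endian hosts
// because it uses shifts instead of reinterpreting memory.
std::uint64_t read_big_endian(const unsigned char* bytes, std::size_t count)
{
    assert(count <= sizeof(std::uint64_t)); // more bytes would not fit
    std::uint64_t value = 0;
    for (std::size_t i = 0; i < count; ++i)
        value = (value << 8) | bytes[i];
    return value;
}
```

After `fread(table_index, 1, 6, file)`, calling `read_big_endian(table_index, 6)` would yield 52 for the bytes shown in the question.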

If the file is filled with 48-bit integers like I am assuming you are talking about, from the char array, you can do this:
unsigned char temp[8] = {0}; // the two leading pad bytes stay zero
unsigned char *data = //...
unsigned char *data_ptr = data;
vector<unsigned long long> numbers;
size_t sz = // Num of 48-bit numbers
for (size_t i = 0; i < sz; i++, data_ptr += 6)
{
memcpy(temp + 2, data_ptr, 6);
unsigned long long value;
memcpy(&value, temp, sizeof value); // copy all 8 bytes, not just temp[0]
numbers.push_back(value);
}
This algorithm assumes that the numbers are all already encoded properly in the file. It also assumes a big-endian host and a file that stores the most significant byte first.

If you want to interpret every 4 bytes of your uchar array as one uint, do this (the result depends on the host byte order, and the array must be suitably aligned):
unsigned char uchararray[totalsize];
unsigned int * uintarray = (unsigned int *)uchararray;
If you want each byte of your uchar array to be converted to one uint, do this:
unsigned char uchararray[totalsize];
unsigned int uintarray[totalsize];
for(int i = 0 ; i < totalsize; i++)
uintarray[i] = (unsigned int)uchararray[i];

Is this what you're talking about?
// long long because it's usually 8 bytes (and there's not usually a 6 byte int type)
vector<unsigned long long> numbers;
fstream infile("testfile.txt");
if (!infile) {
cout << "fail" << endl;
cin.get();
return 0;
}
while (true) {
stringstream numstr;
string tmp;
unsigned long long num;
for (int i = 0; i < 6 && infile >> tmp; ++i)
numstr << hex << tmp;
if (!infile) // stop once the file has no more tokens
break;
cout << numstr.str() << endl;
numstr >> num;
numbers.push_back(num);
}
I tested it with the input you gave (00 00 23 51 A4 D2) and the contents of the vector were 592553170.
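For comparison, here is a shorter sketch of the same idea for a single line of text, assuming the input really is whitespace-separated two-digit hex tokens rather than raw binary. `parse_hex_bytes` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Parse whitespace-separated hex tokens ("00 00 23 51 A4 D2") into one
// number, treating the first token as the most significant byte.
// Hypothetical helper for illustration only.
std::uint64_t parse_hex_bytes(const std::string& text)
{
    std::istringstream in(text);
    std::uint64_t value = 0;
    unsigned int byte = 0;
    int count = 0;
    while (in >> std::hex >> byte) {
        assert(byte <= 0xFF); // each token must be a single byte
        assert(count < 8);    // more than 8 bytes would overflow
        value = (value << 8) | byte;
        ++count;
    }
    return value;
}
```

With the input quoted above, `parse_hex_bytes("00 00 23 51 A4 D2")` gives 592553170, matching the tested result.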

Related

Convert unsigned char array of characters to int C++

How can I convert an unsigned char array that contains letters into an integer? I have tried this so far, but it only converts up to four bytes. I also need a way to convert the integer back into the unsigned char array.
int buffToInteger(char * buffer)
{
int a = static_cast<int>(static_cast<unsigned char>(buffer[0]) << 24 |
static_cast<unsigned char>(buffer[1]) << 16 |
static_cast<unsigned char>(buffer[2]) << 8 |
static_cast<unsigned char>(buffer[3]));
return a;
}
It looks like you want a for loop, i.e. repeating a task over and over again for an indeterminate number of steps.
unsigned int buffToInteger(char * buffer, unsigned int size)
{
// assert(size <= sizeof(int));
unsigned int ret = 0;
int shift = 0;
for( int i = size - 1; i >= 0; i-- ) {
ret |= static_cast<unsigned int>(static_cast<unsigned char>(buffer[i])) << shift; // go through unsigned char to avoid sign extension
shift += 8;
}
return ret;
}
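The question also asks for the reverse direction. A minimal sketch of a round trip, assuming the first byte of the buffer is the most significant one, as in the original buffToInteger; `buff_to_integer` and `integer_to_buff` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

// Bytes -> integer, buffer[0] is the most significant byte.
unsigned int buff_to_integer(const unsigned char* buffer, std::size_t size)
{
    assert(size <= sizeof(unsigned int)); // larger inputs would not fit
    unsigned int value = 0;
    for (std::size_t i = 0; i < size; ++i)
        value = (value << 8) | buffer[i];
    return value;
}

// Integer -> bytes, writing the least significant byte last.
void integer_to_buff(unsigned int value, unsigned char* buffer, std::size_t size)
{
    assert(size <= sizeof(unsigned int));
    for (std::size_t i = size; i-- > 0; ) {
        buffer[i] = static_cast<unsigned char>(value & 0xFF);
        value >>= 8;
    }
}
```

Anything longer than sizeof(unsigned int) bytes cannot be represented, so such a buffer has to be kept as an array (or split into several integers).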
What I think you are going for is called a hash -- converting an object to a unique integer. The problem is a hash IS NOT REVERSIBLE. This hash will produce different results for hash("WXYZABCD", 8) and hash("ABCD", 4). The answer by @Nicholas Pipitone DOES NOT produce different outputs for these different inputs.
Once you compute this hash, there is no way to get the original string back. If you want to keep knowledge of the original string, you MUST keep the original string as a variable.
int hash(char* buffer, size_t size) {
int res = 0;
for (size_t i = 0; i < size; ++i) {
res += buffer[i];
res *= 31;
}
return res;
}
Here's how to convert the first sizeof(int) bytes of the char array to an int:
int val = *(unsigned int *)buffer;
and to convert it back:
*(unsigned int *)buffer = val;
Note that your buffer must be at least the length of your int type size. You should check for this.
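Casting a char pointer to unsigned int * can run into alignment and strict-aliasing problems. A sketch of the same conversion using memcpy, which avoids both; `load_int` and `store_int` are hypothetical names, and the byte order is whatever the host uses:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Copy the first sizeof(int) bytes of the buffer into an int (host byte order).
int load_int(const char* buffer, std::size_t size)
{
    assert(size >= sizeof(int)); // the length check mentioned above
    int value = 0;
    std::memcpy(&value, buffer, sizeof value);
    return value;
}

// Copy an int back into the first sizeof(int) bytes of the buffer.
void store_int(int value, char* buffer, std::size_t size)
{
    assert(size >= sizeof(int));
    std::memcpy(buffer, &value, sizeof value);
}
```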

Store string as hex without converting

I have a string value:
string str = "2018";
Now I have to store in unsigned char array as hex representation but not really convert to hex:
unsigned char data [2]; //[0x20,0x18]
If I do it this way
data[0] = 0x20;
data[1] = 0x18;
It works, but my input is a string. How can I resolve this?
Edit
If my input is unsigned char instead of string like
unsigned char y1 = 20;
unsigned char y2 = 18;
Is there any better way?
A brief research made me find this function QString::toInt(bool&, int) which can be useful for your intent.
Basically you could:
if(str.size() % 2 == 1){
str = '0' + str;
}
for(int i = 0; i < str.size() / 2; i++){
data[i] = (str[2*i] + str[2*i+1]).toInt(res, 16);
}
I did not try this code; there is surely a better way to extract the substring, and probably a more efficient way than iterating over it.
Perhaps you could try something like this:
#include <cstdio>
#include <iostream>
#include <string>
int main()
{
std::string s = "2018";
unsigned i;
std::sscanf(s.c_str(), "%04x", &i);
unsigned char data[2];
data[0] = i >> 8;
data[1] = i;
std::cout << std::hex << (int)data[0] << " " << (int)data[1] << std::endl;
return 0;
}
https://ideone.com/SyYKUl
Prints:
20 18
If you can assume the string to have 4 digits, you can convert it to BCD format simply and efficiently this way:
void convert_to_bcd4(unsigned char *data, const char *str) {
data[0] = (str[0] - '0') * 16 + (str[1] - '0');
data[1] = (str[2] - '0') * 16 + (str[3] - '0');
}
You can complete the conversion of "2018" to 0x20 0x18 using a hex string to binary converter. I think, for example, sscanf("%x",....) will do this. This typically gives an int. You can extract the byte values from the int in the normal way. (This method does not check for errors.)
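The BCD idea also extends to digit strings of any length and to the numeric input from the edit (20 and 18). A sketch under the assumption that the input contains only decimal digits; `pack_bcd` and `to_bcd` are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Pack decimal digits two per byte ("2018" -> 0x20, 0x18).
std::vector<unsigned char> pack_bcd(std::string digits)
{
    if (digits.size() % 2 != 0)
        digits.insert(digits.begin(), '0'); // pad to an even number of digits
    std::vector<unsigned char> out;
    for (std::size_t i = 0; i < digits.size(); i += 2) {
        assert(digits[i] >= '0' && digits[i] <= '9');
        assert(digits[i + 1] >= '0' && digits[i + 1] <= '9');
        out.push_back(static_cast<unsigned char>(
            (digits[i] - '0') * 16 + (digits[i + 1] - '0')));
    }
    return out;
}

// Same idea for a number in the range 0..99 (20 -> 0x20).
unsigned char to_bcd(unsigned char n)
{
    assert(n <= 99);
    return static_cast<unsigned char>((n / 10) * 16 + n % 10);
}
```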

extracting integral type from byte array

I'm writing an integral type to a byte array like this:
unsigned char Data[10]; // Example byte array
signed long long Integer = 1283318; // arbitrary value
for (int i = 0; i < NumBytes; ++i)
Data[i] = (Integer >> (i * 8)) & 0xff; // Set the byte
In this context, NumBytes is the number of bytes actually being written to the array, which can change - sometimes I'll be writing a short, sometimes a int, etc.
In a test case where I know NumBytes == 2, this works to retrieve the integral value:
signed short Integer = (Data[0] << 0) | (Data[1] << 8);
Based on this, I tried to do the same with a long long, so it would work for an arbitrary integral type:
signed long long Integer = 0;
for (int i = 0; i < NumBytes; ++i)
Integer |= static_cast<signed long long>(Data[i]) << (i * 8);
But, this fails when Integer < 0. I'd be thankful if someone could point out what I'm missing here. Am I omitting the sign bit? How would I make sure this is included in a portable way?
Cheers!
This works:
#include <iostream>
using namespace std;
int main() {
signed short Input = -288;
int NumBytes = sizeof(signed long long);
unsigned char Data[10]; // Example byte array
signed long long Integer = Input; // arbitrary value
std::cout << Integer << std::endl;
for (int i = 0; i < NumBytes; ++i)
Data[i] = (Integer >> (i * 8)) & 0xff; // Set the byte
signed long long Integer2 = 0;
for (int i = 0; i < NumBytes; ++i)
Integer2 |= static_cast<signed long long>(Data[i]) << (i * 8);
std::cout << Integer2 << std::endl;
return 0;
}
When you convert the short to a long long as you did in your code, the value is sign-extended, so the sign bit becomes the most significant bit of the long long. That means that to encode and decode it correctly you need all 8 bytes.
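If only NumBytes bytes are stored, the decoder has to restore the sign itself. A minimal sketch, assuming little-endian byte order as in the question and a two's-complement representation; `decode_signed_le` is a hypothetical name:

```cpp
#include <cassert>
#include <cstdint>

// Decode num_bytes little-endian bytes as a signed two's-complement value.
std::int64_t decode_signed_le(const unsigned char* data, int num_bytes)
{
    assert(num_bytes >= 1 && num_bytes <= 8);
    std::uint64_t raw = 0;
    for (int i = 0; i < num_bytes; ++i)
        raw |= static_cast<std::uint64_t>(data[i]) << (i * 8);
    // If the top bit of the last stored byte is set, the value is negative:
    // fill the missing high bits with ones (sign extension).
    if (num_bytes < 8 && (data[num_bytes - 1] & 0x80))
        raw |= ~std::uint64_t{0} << (num_bytes * 8);
    // Conversion to signed assumes two's complement (guaranteed since C++20).
    return static_cast<std::int64_t>(raw);
}
```

With the two bytes 0xE0 0xFE this returns -288, which the plain OR loop would decode as 65248.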

segmentation fault and array issue

I've been going back trying to find a segmentation error in my program. Very often when the program crashes it is at this point.
unsigned long data = octets[g];
So I have tracked this buffer as being created in the main loop with a fixed defined size. However, since it's defined in an if statement in main, does it need to be allocated with "new"? Basically, after receiving from a TCP socket, the char buffer is copied to an unsigned char buffer to check for certain binary data types. So only if data arrives is this called into existence.
INT8U byteArray[BUFFERSIZE];
This buffer is then passed for message ID and crc checking. Is not doing a "new" type allocation the issue because it is in the main loop? I thought it would go out of scope at the end of the "if new data is received" statement.
long calc_crc24q(byte* octets, int start, int last) //start is first byte, last is MSbyte of CRC
{
long crc = CRC24SEED;
for(int g = start; g < last; g++) //should xor from preamble to the end of data
{
unsigned long data = octets[g]; //fault occurs here often
crc = crc ^ data << 16;
for (int i = 0; i < 8; i++)
{
crc <<= 1;
if (crc & 0x1000000)
crc = crc ^ CRC24POLY;
}
}
return crc & 0x00ffffff; //returns an int value with high byte 00 then data in lower 3 bytes
}
//---------------------------------------------
Here is the message id
unsigned int id_message(INT8U* buffer, unsigned int posStart, unsigned int numberbytes, unsigned int& messageLength)
{
unsigned int messID = 0;
unsigned int posEnd;
unsigned int noBytes = 0;
if(buffer[posStart] == Preamble)
{
unsigned int dataLength = (((0x0000 | buffer[posStart+1]) << 8) | buffer[posStart+2]); //0x byte1 byte2
messID = ((0x0000 | (buffer[posStart+3] << 4)) | ((buffer[posStart+4] >> 4) & 0x0F)); //byte1 shift 4 bits add upper 4 bits of byte 2
noBytes = dataLength + 6;
//numberbytes = noBytes;
posEnd = posStart + noBytes - 1;
if(calc_crc24q( buffer, posStart, posEnd-2) != (((0x00000000 | buffer[posEnd-2]) << 16) | ((0x00000000 | buffer[posEnd-1]) << 8) | (0x00000000 | buffer[posEnd])) )
{
cout << "CRC error" << endl;
return 0;
}
//return message type extracted from data segment
messageLength = posStart + noBytes;
return messID;
}
return 255; //unknown type
}
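One thing worth ruling out (an assumption, since the calling code is not shown) is that the length field read from the buffer points past the bytes that were actually received, so calc_crc24q indexes beyond the data. A sketch of a check that could run before the CRC is computed, using the frame layout from id_message (3 header bytes, dataLength payload bytes, 3 CRC bytes); `frame_fits` is a hypothetical helper:

```cpp
#include <cassert>
#include <cstddef>

// Returns true when the whole frame starting at pos_start lies inside the
// bytes_received bytes that are valid in buffer. Hypothetical helper.
bool frame_fits(const unsigned char* buffer, std::size_t bytes_received,
                std::size_t pos_start)
{
    assert(buffer != nullptr);
    // The preamble and both length bytes must be present before reading them.
    if (pos_start > bytes_received || bytes_received - pos_start < 3)
        return false;
    const std::size_t data_length =
        (static_cast<std::size_t>(buffer[pos_start + 1]) << 8) |
        buffer[pos_start + 2];
    // Header (3) + payload + CRC (3) must fit in what is left.
    return data_length + 6 <= bytes_received - pos_start;
}
```

If this returns false, the message is incomplete or corrupt and the CRC loop should not be run on it.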

Split array of m bytes into chunks of n bytes

I'm working on a program that manipulates brain data. It receives a value that represents the current magnitude of 8 commonly recognized types of EEG (brain waves). This data value is output as a series of eight 3-byte unsigned integers in little-endian format.
Here is a piece of my code:
if (extendedCodeLevel == 0 && code == ASIC_EEG_POWER_CODE)
{
fprintf(arq4, "EXCODE level: %d CODE: 0x%02X vLength: %d\n", extendedCodeLevel, code, valueLength );
fprintf(arq4, "Data value(s):" );
for( i=0; i<valueLength; i++ ) fprintf(arq4, " %d", value[0] & 0xFF );
}
The value value[0] is my output. It is the series of bytes that represents the brain waves. The output file currently contains the following data:
EXCODE level: 0x00 CODE: 0x83 vLength: 24
Data value(s): 16 2 17 5 3 2 22 1 2 1 0 0 0 4 0 0 3 0 0 5 1 0 4 8
What I need is to divide the sequence of bytes above into 3-byte chunks, to identify the EEG. The wave delta is represented by the first 3-byte sequence, theta is represented by the next bytes, and so on. How can I do it?
Assuming that you know that your input will always be exactly eight three-byte integers, all you need is a simple loop that reads three bytes from the input and writes them out as a four-byte value. The easiest way to do this is to treat the input as an array of bytes and then pull bytes off of this array in groups of three.
// Convert an array of eight 3-byte integers into an array
// of eight 4-byte integers.
void convert_3to4(const void* input, void* output)
{
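uint32_t tmp;
uint32_t* pOut = (uint32_t*)output;
const uint8_t* pIn = (const uint8_t*)input; // keep the const of the parameter
int i;
for (i=0; i<24; i+=3)
{
tmp = pIn[i];
tmp += ((uint32_t)pIn[i+1] << 8);
tmp += ((uint32_t)pIn[i+2] << 16);
pOut[i / 3] = tmp; // output index 0..7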
}
}
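The same loop can be written for any number of 3-byte values. A sketch assuming the raw bytes are already in a std::vector and are little-endian, as stated in the question; `decode_3byte_le` is a hypothetical name:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Turn consecutive 3-byte little-endian groups into 32-bit values.
std::vector<std::uint32_t> decode_3byte_le(const std::vector<std::uint8_t>& raw)
{
    assert(raw.size() % 3 == 0); // input must consist of whole 3-byte groups
    std::vector<std::uint32_t> values;
    values.reserve(raw.size() / 3);
    for (std::size_t i = 0; i + 2 < raw.size(); i += 3) {
        values.push_back(static_cast<std::uint32_t>(raw[i])
                       | static_cast<std::uint32_t>(raw[i + 1]) << 8
                       | static_cast<std::uint32_t>(raw[i + 2]) << 16);
    }
    return values;
}
```

For the 24 bytes in the question, the first group 16 2 17 decodes to 16 + 2*256 + 17*65536 = 1114640, which would be the delta value.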
Like this? This assumes valueLength is a multiple of 3; otherwise the last iteration reads past the end of the data. Do you need the leftover bytes?
for( i=0; i<valueLength; i+=3 ) fprintf(arq4, "%d %d %d - ", value[i] & 0xFF,
value[i+1] & 0xFF,
value[i+2] & 0xFF );
Converting eight 3-byte little-endian character streams into eight 4-byte integers is fairly trivial:
for( int i = 0; i < 24; ++i )
{
output[ i / 3 ] |= input[ i ] << ( 8 * ( i % 3 ) );
}
I think that (untested) code will do it, assuming input is a 24-entry unsigned char array and output is an eight-entry int array initialized to zero.
You might try something like this:
union _32BitValue
{
uint8_t bytes[4];
uint32_t uval;
};
size_t extractUint32From3ByteSegemented(const std::vector<uint8_t>& rawData, size_t index, uint32_t& result)
{
// Maybe do some checks here to make sure the vector is large enough to extract the data from it,
// throwing exception or return 0 etc. ...
_32BitValue tmp;
tmp.bytes[0] = 0;
tmp.bytes[1] = rawData[index + 2];
tmp.bytes[2] = rawData[index + 1];
tmp.bytes[3] = rawData[index];
result = ntohl(tmp.uval);
return index + 3;
}
The code used to parse the values from the raw data array:
size_t index = 0;
std::vector<uint8_t> rawData = readRawData(); // Provide such method to read the raw data into the vector
std::vector<uint32_t> myDataValues;
while(index < rawData.size())
{
uint32_t extractedValue;
index = extractUint32From3ByteSegemented(rawData,index,extractedValue);
// Depending on what error detection you choose do check for index returned
// != 0, or catch exception ...
myDataValues.push_back(extractedValue);
}
// Continue with calculations on the extracted values ...
Using the left shift operator and addition as shown in other answers will do the trick as well. But IMHO this sample shows clearly what's going on. It fills the union's byte array with a value in big-endian (network) order and uses ntohl() to retrieve the result in the host machine's format (big- or little-endian) portably.
What I need is, instead of displaying the whole sequence of 24 bytes, I need to get the 3-byte sequences separately.
You can easily copy the 1d byte array to the desired 2d shape.
Example:
#include <inttypes.h>
#include <stdio.h>
#include <string.h>
int main() {
/* make up data */
uint8_t bytes[] =
{ 16, 2, 17,
5, 3, 2,
22, 1, 2,
1, 0, 0,
0, 4, 0,
0, 3, 0,
0, 5, 1,
0, 4, 8 };
int32_t n_bytes = sizeof(bytes);
int32_t chunksize = 3;
int32_t n_chunks = n_bytes/chunksize + (n_bytes%chunksize ? 1 : 0);
/* chunkify */
uint8_t chunks[n_chunks][chunksize];
memset(chunks, 0, sizeof(uint8_t[n_chunks][chunksize]));
memcpy(chunks, bytes, n_bytes);
/* print result */
size_t i, j;
for (i = 0; i < n_chunks; i++)
{
for (j = 0; j < chunksize; j++)
printf("%02hhd ", chunks[i][j]);
printf("\n");
}
return 0;
}
The output is:
16 02 17
05 03 02
22 01 02
01 00 00
00 04 00
00 03 00
00 05 01
00 04 08
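The example above is C and relies on a variable-length array, which standard C++ does not have. A rough C++ equivalent, assuming the bytes are in a std::vector; `split_chunks` is a hypothetical name, and a short final chunk is zero-padded like the memset version:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Split a flat byte sequence into chunks of chunk_size bytes each.
std::vector<std::vector<std::uint8_t>> split_chunks(
    const std::vector<std::uint8_t>& bytes, std::size_t chunk_size)
{
    assert(chunk_size > 0);
    std::vector<std::vector<std::uint8_t>> chunks;
    for (std::size_t i = 0; i < bytes.size(); i += chunk_size) {
        // Start from a zero-filled chunk so a short final chunk is padded.
        std::vector<std::uint8_t> chunk(chunk_size, 0);
        for (std::size_t j = 0; j < chunk_size && i + j < bytes.size(); ++j)
            chunk[j] = bytes[i + j];
        chunks.push_back(chunk);
    }
    return chunks;
}
```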
I used some of the examples here to come up with a solution, so I thought I'd share it. It could be a basis for an interface so that objects can transmit copies of themselves over a network with the hton and ntoh functions, which is actually what I am trying to do.
#include <iostream>
#include <string>
#include <exception>
#include <arpa/inet.h>
#include <cstdint>
using namespace std;
void DispLength(string name, size_t var){
cout << "The size of " << name << " is : " << var << endl;
}
typedef int8_t byte;
class Bytes {
public:
Bytes(void* data_ptr, size_t size)
: size_(size)
{ this->bytes_ = (byte*)data_ptr; }
~Bytes(){ bytes_ = NULL; } // Caller is responsible for data deletion.
const byte& operator[] (int idx){
if(idx >= 0 && (size_t)idx < size_) // valid indices are 0 .. size_ - 1
return bytes_[idx];
else
throw exception();
}
int32_t ret32(int idx) //-- Return a 32 bit value starting at index idx
{
int32_t* ret_ptr = (int32_t*)&((*this)[idx]);
int32_t ret = *ret_ptr;
return ret;
}
int64_t ret64(int idx) //-- Return a 64 bit value starting at index idx
{
int64_t* ret_ptr = (int64_t*)&((*this)[idx]);
int64_t ret = *ret_ptr;
return ret;
}
template <typename T>
T retVal(int idx) //-- Return a value of type T starting at index idx
{
T* T_ptr = (T*)&((*this)[idx]);
T T_ret = *T_ptr;
return T_ret;
}
protected:
Bytes() : bytes_(NULL), size_(0) {}
private:
byte* bytes_; //-- pointer used to scan for bytes
size_t size_;
};
int main(int argc, char** argv){
long double LDouble = 1.0;
Bytes bytes(&LDouble, sizeof(LDouble));
DispLength(string("LDouble"), sizeof(LDouble));
DispLength(string("bytes"), sizeof(bytes));
cout << "As a long double LDouble is " << LDouble << endl;
for( int i = 0; i < (int)sizeof(LDouble); i++){
cout << "Byte " << i << " : " << (int)bytes[i] << endl; // print the numeric value, not a character
}
cout << "Through the eyes of bytes : " <<
(long double) bytes.retVal<long double>(0) << endl;
return 0;
}
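The pointer casts in ret32, ret64 and retVal can read from a misaligned address, and they only check the first byte of the range. A sketch of a memcpy-based alternative that checks the whole range, assuming T is trivially copyable; `read_as` is a hypothetical free function, not part of the class above:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <type_traits>

// Read a T from `data` (which holds `size` bytes) starting at byte `offset`.
template <typename T>
T read_as(const void* data, std::size_t size, std::size_t offset)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    assert(data != nullptr);
    // All sizeof(T) bytes must lie inside the buffer, not just the first one.
    if (offset > size || size - offset < sizeof(T))
        throw std::out_of_range("read_as: not enough bytes");
    T value;
    std::memcpy(&value, static_cast<const unsigned char*>(data) + offset,
                sizeof(T));
    return value;
}
```

The bytes are still interpreted in host order, so values sent over a network would additionally need the hton/ntoh conversions mentioned above.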
You can use bit manipulation operators.
The following is not the actual code, just an example of the idea:
for (int i = 0; i < 8; ++i) {
unsigned temp = Value & 0x7; // AND with binary 111 keeps the low 3 bits
Value = Value >> 3; // shift right to the next group
}
Some self documenting, maintainable code might look something like this (untested).
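typedef union
{
struct {
uint8_t value[3]; // the three little-endian data bytes
uint8_t padding; // most significant byte on a little-endian host, stays zero
} raw;
int32_t data; // only correct on a little-endian host
} Measurement;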
void convert(uint8_t* rawValues, int32_t* convertedValues, int convertedSize)
{
Measurement sample;
int i;
memset(&sample, '\0', sizeof(sample));
for(i=0; i<convertedSize; ++i)
{
memcpy(&sample.raw.value[0], &rawValues[i*sizeof(sample.raw.value)], sizeof(sample.raw.value));
convertedValues[i]=sample.data;
}
}