I have written a small application that, at some point, works with binary data. In unit tests, I compare this data with the expected data. When a comparison fails, I want the test to display the hexadecimal output, such as:
Failure
Expected: string_to_hex(expected, 11)
Which is: "01 43 02 01 00 65 6E 74 FA 3E 17"
To be equal to: string_to_hex(writeBuffer, 11)
Which is: "01 43 02 01 00 00 00 00 98 37 DB"
In order to display that (and to compare binary data in the first place), I used the code from Stack Overflow, slightly modifying it for my needs:
std::string string_to_hex(const std::string& input, size_t len)
{
static const char* const lut = "0123456789ABCDEF";
std::string output;
output.reserve(2 * len);
for (size_t i = 0; i < len; ++i)
{
const unsigned char c = input[i];
output.push_back(lut[c >> 4]);
output.push_back(lut[c & 15]);
}
return output;
}
When checking for memory leaks with valgrind, I found a lot of errors such as this one:
Use of uninitialised value of size 8
at 0x11E75A: string_to_hex(std::__cxx11::basic_string, std::allocator > const&, unsigned long)
I'm not sure I understand it. First, everything seems to be initialized, including, unless I'm mistaken, output. Moreover, there is no mention of size 8 in the code; the value of len varies from test to test, while valgrind reports the same size 8 every time.
How should I fix this error?
This is one of those cases where passing a char pointer to a buffer of arbitrary binary data into the implicit std::string constructor truncates the string at the first \0. string_to_hex then reads len bytes from a string that is shorter than len, which would explain the valgrind report. The straightforward approach is to pass a raw pointer and a length, but a better solution is to use array_view, span, or a similar utility class that provides index validation, at least in debug builds, for both input and lut.
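A minimal sketch of the raw-pointer approach, assuming the caller holds the data as a byte buffer with an explicit length. The name bytes_to_hex and the space separator are illustrative choices, not part of the original code. Because the length is passed explicitly, embedded \0 bytes are not treated as terminators:

```cpp
#include <cstddef>
#include <string>

// Hypothetical replacement for string_to_hex: takes the raw bytes and their
// count, so no std::string is built from a char* along the way.
std::string bytes_to_hex(const unsigned char* data, std::size_t len)
{
    static const char lut[] = "0123456789ABCDEF";
    std::string output;
    output.reserve(3 * len);
    for (std::size_t i = 0; i < len; ++i)
    {
        if (i != 0)
            output.push_back(' ');   // separator between bytes
        const unsigned char c = data[i];
        output.push_back(lut[c >> 4]);
        output.push_back(lut[c & 15]);
    }
    return output;
}
```

If the data has to stay in a std::string, constructing it with an explicit length, as in std::string(ptr, len), should also avoid the truncation.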
I'm building something which uses the DS18B20 temperature sensor.
First I am trying to understand the example CRC in the Maxim application note 27, "Understanding and Using Cyclic Redundancy Checks with Maxim iButton Products" (https://www.analog.com/en/technical-articles/understanding-and-using-cyclic-redundancy-checks-with-maxim-1wire-and-ibutton-products.html).
It doesn't look too hard to code the conversion but my problem is that I cannot find any calculator that gives me the correct answer of 0xA2.
On page 5, example 2 the complete ROM code is given in hex as A2=(CRC), 00 00 00 01 B8 1C 02=(Family code).
The generator polynomial is 100110001 (X^8 + X^5 + X^4 + 1).
On the site https://crccalc.com/ it has a CRC-8/MAXIM algorithm which has the correct generator but the RefIn and RefOut are both true whereas I cannot see anything in the application note about reversing parts (although I have tried this).
On the site https://tomeko.net/online_tools/crc8.php?lang=en it claims to implement the CRC from the application note but it gives the same answers as crccalc.com. Also note that crccalc has the same lookup table for the Maxim algorithm as the application note so no surprise the two web sites are giving the same answers.
Finally I found a site, https://www.rndtool.info/CRC-step-by-step-calculator/, that allows me to add the polynomial and bit stream in binary and it shows the 'hand' calculation of the CRC. This says nothing about input and output refs so I assume they are false. This gives different answers to the other two sites probably because of the ref values but still does not give 0xA2.
Has anyone correctly calculated the given value in the application note?
I don't want to start programming until I understand what is going on and I cannot read data from a device if I cannot decipher the CRC correctly. This is driving me mad at the moment. I've tried the number reflected and forwards with the generator reflected and forwards plus reversing the answer but I never get 0xA2.
You didn't give a language in your tags. Here is an example in C:
#include <stdio.h>
unsigned crc8maximdow(unsigned char *data, size_t len) {
unsigned crc = 0;
for (size_t i = 0; i < len; i++) {
crc ^= data[i];
for (unsigned k = 0; k < 8; k++)
crc = crc & 1 ? (crc >> 1) ^ 0x8c : crc >> 1;
}
return crc;
}
int main(void) {
unsigned char data[] = {2, 0x1c, 0xb8, 1, 0, 0, 0};
printf("0x%02x\n", crc8maximdow(data, sizeof(data)));
return 0;
}
That prints 0xa2.
Using Mark's answer above I worked out what I was doing wrong with the Maxim example. Tutorials were talking about 'reflecting' the input data so I literally took the 56-bit bit pattern and reversed it which gave the wrong answer in the site https://crccalc.com/. Instead I reversed the hex numbers so that 00 00 00 01 B8 1C 02 becomes 02 1C B8 01 00 00 00 which gives the correct CRC of 0xA2 on the website. Hope this may help anyone else having the same problem.
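For anyone working in C++, the same calculation can be sketched as below. The function names are illustrative. The second helper relies on a general property of this kind of CRC (initial value 0, no final XOR): running the CRC over the data followed by its own CRC byte gives 0, which is a convenient way to validate a full 8-byte ROM code.

```cpp
#include <cstddef>
#include <cstdint>

// Bitwise CRC-8/MAXIM: polynomial x^8 + x^5 + x^4 + 1, processed LSB first,
// so the polynomial appears bit-reversed as 0x8C. Initial value is 0.
std::uint8_t crc8_maxim(const std::uint8_t* data, std::size_t len)
{
    std::uint8_t crc = 0;
    for (std::size_t i = 0; i < len; ++i)
    {
        crc ^= data[i];
        for (int k = 0; k < 8; ++k)
            crc = (crc & 1) ? static_cast<std::uint8_t>((crc >> 1) ^ 0x8C)
                            : static_cast<std::uint8_t>(crc >> 1);
    }
    return crc;
}

// Hypothetical helper: a ROM code is taken as valid if the CRC over all
// 8 bytes (7 data bytes followed by the CRC byte) comes out as 0.
bool rom_code_is_valid(const std::uint8_t (&rom)[8])
{
    return crc8_maxim(rom, 8) == 0;
}
```

The bytes must be fed in the order they come off the bus, family code first: 02 1C B8 01 00 00 00, then A2.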
I have a data packet that I want to send to my device over TCP/IP. My array is:
unsigned char array1[] = {'0x00', '0x84', '0x00', '0x00', '0x00', '0x06', '0x54', '0x01', '0x00', '0x01', '0x00', '0x03'};
I want to convert this into a string. How do I do it?
Right now I am just manually writing down the decimal equivalent:
unsigned char array1[] = {0,132,0,0,0,6,84,5,0,2,255,0};
and converting it into string:
std::string data ( array1, array1 + sizeof array1 / sizeof array1[0] );
However, I wonder whether I can use my hex packet directly as a string:
string x= "00 84 00 00 00 06 54 05 00 02 FF 00";
Also, is there a way to define my message header, which is the first 7 bytes and doesn't change, separately from the rest of the message, which does change?
The following code should do what you need.
std::string s { "\x00\x01\x02\x03\x04", 5 };
Use the std::string constructor that also takes the length aka number of bytes.
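For the fixed-header part of the question, one possible sketch is to keep the header bytes in a constant array and append the variable part. The names kHeader and make_packet are hypothetical, and the header bytes are simply the first seven bytes of the packet shown in the question:

```cpp
#include <cstddef>
#include <string>

// The seven header bytes from the question's packet; treated here as fixed.
const unsigned char kHeader[] = {0x00, 0x84, 0x00, 0x00, 0x00, 0x06, 0x54};

// Hypothetical helper: returns the header followed by a variable payload.
// Lengths are passed explicitly, so 0x00 bytes are preserved.
std::string make_packet(const unsigned char* payload, std::size_t payload_len)
{
    std::string packet(reinterpret_cast<const char*>(kHeader), sizeof kHeader);
    packet.append(reinterpret_cast<const char*>(payload), payload_len);
    return packet;
}
```

When sending, use packet.data() together with packet.size() rather than anything that stops at the first zero byte.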
I have a 320Mb binary file (data.dat), containing 32e7 lines of hex numbers:
1312cf60 d9 ff e0 ff 05 00 f0 ff 22 00 2f 00 fe ff 33 00 |........"./...3.|
1312cf70 00 00 00 00 f4 ff 1d 00 3d 00 6d 00 53 00 db ff |........=.m.S...|
1312cf80 b7 ff b0 ff 1e 00 0c 00 67 00 d1 ff be ff f8 ff |........g.......|
1312cf90 0b 00 6b 00 38 00 f3 ff cf ff cb ff e4 ff 4b 00 |..k.8.........K.|
....
Original numbers were:
(16,-144)
(-80,-64)
(-80,16)
(16,48)
(96,95)
(111,-32)
(64,-96)
(64,-16)
(31,-48)
(-96,-48)
(-32,79)
(16,48)
(-80,80)
(-48,128)
...
I have a matlab code which can read them as real numbers and convert them to complex numbers:
nsamps = (256*1024);
for i = 1:305
nstart = 1 + (i - 1) * nsamps ;
fid = fopen('data.dat');
fseek(fid,4 * nstart ,'bof');
y = fread(fid,[2,nsamps],'short');
fclose(fid);
x = complex(y(1,:),y(2,:));
end
I am using C++ and trying to get data as a vector<complex<float>>:
std::ifstream in('data.dat', std::ios_base::in | std::ios_base::binary);
fseek(infile1, 4*nstart, SEEK_SET);
vector<complex<float> > sx;
in.read(reinterpret_cast<char*>(&sx), sizeof(int));
and I am confused about how to get the complex data in C++. Can anyone help me?
Theory
I'll try to explain some points using the issues in your code as examples.
Let's start from the end of the code. You try to read a number, which is stored as a four-byte single-precision floating point number, but you use sizeof(int) as a size argument. While on modern x86 platforms with modern compilers sizeof(int) tends to be equal to sizeof(float), it's not guaranteed. sizeof(int) is compiler dependent, so please use sizeof(float) instead.
In the matlab code you read 2*nsamps numbers, while in C++ code only four bytes (one number) is being read. Something like sizeof(float) * 2 * nsamps would be closer to matlab code.
Next, std::complex is a complicated class, which (in general) may have any implementation-defined internal representation. But luckily, the reference documentation for std::complex states that
For any object z of type complex<T>, reinterpret_cast<T(&)[2]>(z)[0]
is the real part of z and reinterpret_cast<T(&)[2]>(z)[1] is the
imaginary part of z.
For any pointer to an element of an array of complex<T> named p and
any valid array index i, reinterpret_cast<T*>(p)[2*i] is the real part
of the complex number p[i], and reinterpret_cast<T*>(p)[2*i + 1] is
the imaginary part of the complex number p[i].
so we can simply cast a pointer to std::complex to char* and read binary data into it. But std::vector is a class template with its own implementation-defined internal representation as well! This means that we can't just reinterpret_cast<char*>(&sx) and write binary data to that pointer, as it points to the beginning of the vector object, which is unlikely to be the beginning of the vector's data. The modern C++ way to get the beginning of the data is to call sx.data(). The pre-C++11 way is to take the address of the first element: &sx[0]. Overwriting the object from the beginning will almost always result in a segfault.
OK, now we have the beginning of the data buffer, which can receive the binary representation of complex numbers. But when you declared vector<complex<float> > sx;, it got zero size, and since you are not pushing or emplacing its elements, the vector does not "know" that it should resize. Segfault again. So just call resize:
sx.resize(number_of_complex_numbers_to_store);
or use an appropriate constructor:
vector<complex<float> > sx(number_of_complex_numbers_to_store);
before writing data to the vector. Note that these methods operate on the "high-level" concept of the number of stored elements, not the number of bytes to store.
Putting it all together, the last two lines of your code should look like:
vector<complex<float> > sx(nsamps);
in.read(reinterpret_cast<char*>(sx.data()), 2 * nsamps * sizeof(float));
Minimal example
If you continue having troubles, try a simpler sandbox code first.
For example, let's write six floats to a binary file:
std::ofstream ofs("file.dat", std::ios::binary | std::ios::out);
float foo[] = {1,2,3,4,5,6};
ofs.write(reinterpret_cast<char*>(foo), 6*sizeof(float));
ofs.close();
then read them to a vector of complex:
std::ifstream ifs("file.dat", std::ios::binary | std::ios::in);
std::vector<std::complex<float>> v(3);
ifs.read(reinterpret_cast<char*>(v.data()), 6*sizeof(float));
ifs.close();
and, finally, print them:
std::cout << v[0] << " " << v[1] << " " << v[2] << std::endl;
The program prints:
(1,2) (3,4) (5,6)
so this approach works fine.
Binary files
Here is the remark about binary files which I initially posted as a comment.
Binary files have no concept of "lines". The number of "lines" in a binary file depends entirely on the size of the window you are viewing it in. Think of a binary file as a magnetic tape, where each discrete position of the head can read only one byte. Interpretation of those bytes is up to you.
If everything should work fine, but you get weird numbers, check the displacement in fseek call. A mistake by a number of bytes yields random-looking values instead of the floats you wish to get.
Surely, you might just read a vector (or an array) of floats, observing the above considerations, and then convert them to complex numbers in a loop. Also, it's a good way to debug your fseek call to make sure that you start reading from the right place.
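One caveat worth checking: the MATLAB code reads 'short' values, which suggests the file holds 16-bit integers rather than floats. Under that assumption, the read-then-convert approach could look like the sketch below. The function name and parameters are hypothetical, and it assumes the file's byte order matches the machine's:

```cpp
#include <complex>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: reads up to `count` interleaved (re, im) pairs of
// 16-bit integers starting at `byte_offset` and converts them to
// complex<float>. Assumes the file uses the machine's native byte order.
std::vector<std::complex<float>> read_complex_int16(const std::string& path,
                                                    std::streamoff byte_offset,
                                                    std::size_t count)
{
    std::ifstream in(path, std::ios::binary);
    std::vector<std::complex<float>> result;
    if (!in)
        return result;

    in.seekg(byte_offset, std::ios::beg);
    std::vector<std::int16_t> raw(2 * count);
    in.read(reinterpret_cast<char*>(raw.data()),
            static_cast<std::streamsize>(raw.size() * sizeof(std::int16_t)));

    // gcount() reports how many bytes were actually read (the file may be short).
    const std::size_t pairs =
        static_cast<std::size_t>(in.gcount()) / (2 * sizeof(std::int16_t));
    result.reserve(pairs);
    for (std::size_t i = 0; i < pairs; ++i)
        result.emplace_back(static_cast<float>(raw[2 * i]),
                            static_cast<float>(raw[2 * i + 1]));
    return result;
}
```

To mirror the MATLAB loop, the offset would be 4 * nstart and the count nsamps, since each complex sample occupies two 2-byte values.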
I have a std::vector<char> buffer in memory with a number at a specific offset, e.g.
00 00 00 00 00 00 00 00 00 33 2E 31 34 99 99 99 .........3.14™™™
I know the start and end offsets of the double/float value, but right now I'm copying the relevant part with std::copy() into a std::string and then calling std::stod. My question is: how can I make this faster?
There must be a way to avoid the copy. For instance, can I point a stream to a specific offset in another buffer, or do something similar?
If the numbers were delimited, then using strtod directly on the buffer like Let_Me_Be suggests is efficient. However, since the numbers are not delimited, you cannot use strtod directly.
If the buffer is zero (or eof) terminated, then you can simply modify it, by adding the terminator after the number, and then restore the original character, like bolov suggested. Since the end offset is part of the number, there's always at least the terminator after it, so offset_end won't overflow. The following code assumes that offset_end is one past the last character. If it's the last character, then simply use + 1.
auto original = data[offset_end];
data[offset_end] = '\0';
auto result = strtod(&data[offset_start], nullptr);
data[offset_end] = original;
Even if the buffer is not terminated, you can still do that, but only if the number is not at the very end. If it is, or if you don't know where the buffer ends, or the buffer is const, then your current solution is as efficient as it gets.
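The terminate-parse-restore idea can be wrapped in a small function. This is only a sketch: the name parse_double_in_place is made up, and it assumes offset_end is one past the last character of the number and is still a valid index into the buffer.

```cpp
#include <cstddef>
#include <cstdlib>
#include <vector>

// Hypothetical helper wrapping the terminate-parse-restore idea.
// Precondition: offset_start < offset_end < data.size(), i.e. there is at
// least one byte after the number that can be overwritten temporarily.
double parse_double_in_place(std::vector<char>& data,
                             std::size_t offset_start,
                             std::size_t offset_end)
{
    const char original = data[offset_end];
    data[offset_end] = '\0';                       // temporary terminator
    const double result = std::strtod(&data[offset_start], nullptr);
    data[offset_end] = original;                   // restore the buffer
    return result;
}
```

With the buffer from the question, the number occupies indices 9 to 12, so the call would be parse_double_in_place(buffer, 9, 13). Note that the buffer is modified briefly, so this is not safe if another thread reads it at the same time.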
If you know the offset then simply:
vector<char> data;
// ... snip ...
char *endp = nullptr;
double result = strtod(&data[offset],&endp);
Note: This assumes that the number is followed by non-numeric characters (or end of string).
What I must do is open a file in binary mode that contains stored data that is intended to be interpreted as integers. I have seen other examples such as Stackoverflow-Reading “integer” size bytes from a char* array. but I want to try taking a different approach (I may just be stubborn, or stupid :/). I first created a simple binary file in a hex editor that reads as follows.
00 00 00 47 00 00 00 17 00 00 00 41
This (should) equal 71, 23, and 65 if the 12 bytes were divided into 3 integers.
After opening this file in binary mode and reading 4 bytes into an array of chars, how can I use bitwise operations to make the bits of char[0] the first 8 bits of an int, and so on, until the bits of each char are part of the int?
My integer = 00 00 00 00
+ ^ ^ ^ ^
Chars Char[0] Char[1] Char[2] Char[3]
00 00 00 47
So my integer(hex) = 00 00 00 47 = numerical value of 71
Also, I don't know how the endianness of my system comes into play here, so is there anything that I need to keep in mind?
Here is a code snippet of what I have so far, I just don't know the next steps to take.
std::fstream myfile;
myfile.open("C:\\Users\\Jacob\\Desktop\\hextest.txt", std::ios::in | std::ios::out | std::ios::binary);
if(myfile.is_open() == false)
{
std::cout << "Error" << std::endl;
}
char* mychar;
std::cout << myfile.is_open() << std::endl;
mychar = new char[4];
myfile.read(mychar, 4);
I eventually plan on reading floats from a file, and maybe a custom data type, but first I just need to get more familiar with bitwise operations.
Thanks.
You want the bitwise left shift operator:
typedef unsigned char u8; // in case char is signed by default on your platform
unsigned num = ((u8)chars[0] << 24) | ((u8)chars[1] << 16) | ((u8)chars[2] << 8) | (u8)chars[3];
What it does is shift the left argument a specified number of bits to the left, adding zeros from the right as stuffing. For example, 2 << 1 is 4, since 2 is 10 in binary and shifting one to the left gives 100, which is 4.
This can be written in a more general loop form:
unsigned num = 0;
for (int i = 0; i != 4; ++i) {
num |= (u8)chars[i] << (24 - i * 8); // += could have also been used
}
The endianness of your system doesn't matter here; you know the endianness of the representation in the file, which is constant (and therefore portable), so when you read in the bytes you know what to do with them. The internal representation of the integer in your CPU/memory may be different from that of the file, but the logical bitwise manipulation of it in code is independent of your system's endianness; the least significant bits are always at the right, and the most at the left (in code). That's why shifting is cross-platform -- it operates at the logical bit level :-)
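Applied to the 12-byte file from the question, the shifting approach could be sketched as follows. The function names are illustrative, and fixed-width std::uint32_t is used here instead of unsigned so the result is 32 bits on every platform:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assembles one big-endian 32-bit value from four bytes.
std::uint32_t read_be32(const unsigned char* p)
{
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8)  |
            static_cast<std::uint32_t>(p[3]);
}

// Hypothetical helper: decodes every complete 4-byte group in a buffer.
std::vector<std::uint32_t> decode_be32(const unsigned char* data, std::size_t len)
{
    std::vector<std::uint32_t> values;
    for (std::size_t i = 0; i + 4 <= len; i += 4)
        values.push_back(read_be32(data + i));
    return values;
}
```

For the bytes 00 00 00 47 00 00 00 17 00 00 00 41 this yields 71, 23 and 65, regardless of the byte order of the machine running the code.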
Have you thought of using Boost.Spirit to make a binary parser? You might hit a bit of a learning curve when you start, but if you want to expand your program later to read floats and structured types, you'll have an excellent base to start from.
Spirit is very well-documented and is part of Boost. Once you get around to understanding its ins and outs, it's really mind-boggling what you can do with it, so if you have a bit of time to play around with it, I'd really recommend taking a look.
Otherwise, if you want your binary to be "portable" - i.e. you want to be able to read it on a big-endian and a little-endian machine, you'll need some sort of byte-order mark (BOM). That would be the first thing you'd read, after which you can simply read your integers byte by byte. Simplest thing would probably be to read them into a union (if you know the size of the integer you're going to read), like this:
union U
{
unsigned char uc_[4];
unsigned long ui_;
};
Read the data into the uc_ member, swap the bytes around if you need to change endianness, and read the value from the ui_ member. There's no shifting etc. to be done, except for the swapping if you want to change endianness.
HTH
rlc
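Two caveats apply to the union approach: in C++, reading a union member other than the one last written is formally undefined behaviour (many compilers accept it as an extension), and unsigned long is 8 bytes on some 64-bit platforms. A sketch of the same read-then-swap idea using std::memcpy and std::uint32_t instead is shown below; all function names are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Copies four bytes into a 32-bit integer as-is (native byte order).
std::uint32_t load_native32(const unsigned char* bytes)
{
    std::uint32_t value;
    std::memcpy(&value, bytes, sizeof value);
    return value;
}

// Reverses the byte order of a 32-bit value.
std::uint32_t byte_swap32(std::uint32_t v)
{
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Hypothetical runtime check: true on a little-endian machine.
bool is_little_endian()
{
    const std::uint32_t one = 1;
    unsigned char first;
    std::memcpy(&first, &one, 1);
    return first == 1;
}

// Reads a big-endian value, swapping only when the host is little-endian.
std::uint32_t load_be32(const unsigned char* bytes)
{
    const std::uint32_t v = load_native32(bytes);
    return is_little_endian() ? byte_swap32(v) : v;
}
```

For the first four bytes of the example file, 00 00 00 47, load_be32 returns 71.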