MODBUS (RTU mode) CRC calculation... what's wrong? Is it a misprint in the DPS5020 user manual?

I am analyzing the MODBUS protocol (RS-232 COM port) used by the DPS5020 power supply module, and I cannot work out the CRC calculation method in RTU mode (page 3 of the manual).
In the first example on page 4, for the sent bytes 1, 3, 0, 2, 0, 2, the CRC is given as 65CB (hex), with the two bytes swapped.
I have also tried several online CRC calculators but cannot reproduce that value.
I also worked through the calculation step by step, including the right rotation of the bits, but my values do not match.
Is the CRC calculated over all six bytes of the frame, or only over the four data bytes? I have tried both without success.
Could you show, step by step, how the calculation is done and what the intermediate values are (16-bit XOR with A001, rotate right yes/no, and so on)?
I know that the two bytes have to be swapped at the end, but even the individual byte values do not match.
Or is it simply a misprint in the manual?

All bytes in the frame are used in the CRC calculation.
Here is a C implementation of the CRC, which should answer your question about exactly what to shift and exclusive-or when:
#include <stddef.h>
#include <stdint.h>

uint16_t crc16modbus_bit(uint16_t crc, void const *mem, size_t len) {
    unsigned char const *data = mem;
    if (data == NULL)
        return 0xffff;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];                 // exclusive-or the next byte into the low half
        for (unsigned k = 0; k < 8; k++) {
            // shift right one bit; if a 1 fell out, exclusive-or with 0xa001
            crc = crc & 1 ? (crc >> 1) ^ 0xa001 : crc >> 1;
        }
    }
    return crc;
}
(The initial CRC value is returned when called with mem equal to NULL.)
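As a check against the manual's example, the same routine can be run over the six frame bytes 01 03 00 02 00 02. This is only a minimal sketch: `modbus_append_crc` is a hypothetical helper name (it is not from the DPS5020 manual), and it assumes the buffer has two spare bytes after the data. The register ends at 0xCB65, and since Modbus RTU sends the low byte first, the bytes on the wire are 65 CB, which is what the manual prints.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-16/MODBUS, same algorithm as the routine above. */
uint16_t crc16modbus_bit(uint16_t crc, void const *mem, size_t len) {
    unsigned char const *data = mem;
    if (data == NULL)
        return 0xffff;                  /* initial register value */
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (unsigned k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ 0xa001 : crc >> 1;
    }
    return crc;
}

/* Hypothetical helper: compute the CRC over frame[0..len-1] and store it
   in frame[len] and frame[len + 1], low byte first. The caller must
   provide room for those two extra bytes. Returns the CRC. */
uint16_t modbus_append_crc(unsigned char *frame, size_t len) {
    uint16_t crc = crc16modbus_bit(0, NULL, 0);
    crc = crc16modbus_bit(crc, frame, len);
    frame[len] = crc & 0xff;            /* low byte goes out first */
    frame[len + 1] = crc >> 8;          /* then the high byte */
    return crc;
}
```

Calling `modbus_append_crc(frame, 6)` on a buffer that starts with 01 03 00 02 00 02 fills in 65 CB. A receiver can run the same CRC over all eight bytes; for an intact frame the result is 0.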


crc32 function explanation with regards to data streams

Where can I find this crc32() function explained in detail? I saw this link but I couldn't figure out how the CRC is being calculated.
I asked a question about updating the CRC as the data stream arrives, instead of waiting to have all the data. I got the following answer (thanks to #MarkAdler):
unsigned long crc = crc32(0, NULL, 0);   // initial CRC
for (...) {                              // some sort of loop
    ...                                  // generating a chunk of data
    crc = crc32(crc, buf, len);          // update the CRC with the data
    ...                                  // this all gets repeated many times
}
...                                      // loop is done, crc has the CRC
Can you please be more specific about the crc32() function?
Is there any pseudo code for that function which explains it?
And the for loop here is for getting the data, right?
Thank you
More specific how?
Do you want an explanation of how to use it, or how it works internally? How it works internally is in the source code you linked.
Yes. You would get chunks of data from wherever, and feed them through crc32() to get the CRC-32/ISO-HDLC of the entire stream of data.
As for the source code you linked, that was not written to teach anyone about how CRCs work. It was written to be fast. Here is a simple but slow (one bit at a time) CRC routine in C that may or may not help you with what you're looking for:
#include <stddef.h>
#include <stdint.h>

uint32_t crc32iso_hdlc_bit(uint32_t crc, void const *mem, size_t len) {
    unsigned char const *data = mem;
    if (data == NULL)
        return 0;                       // initial CRC value
    crc = ~crc;                         // undo the final inversion of the previous call
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (unsigned k = 0; k < 8; k++) {
            crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        }
    }
    crc = ~crc;                         // final inversion
    return crc;
}
(That code was generated by crcany.)
If you would like to learn more about CRCs, start here.
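To make the "feed it chunk by chunk" point concrete, here is a small sketch that runs the bit-at-a-time routine over a buffer in pieces. `crc32_in_chunks` is a hypothetical helper written for this illustration (it is not part of zlib), and the buffer simply stands in for data that would really arrive from a stream. The result should be the same no matter how the data is cut up.

```c
#include <stddef.h>
#include <stdint.h>

/* Bit-at-a-time CRC-32/ISO-HDLC, same algorithm as the routine above. */
uint32_t crc32iso_hdlc_bit(uint32_t crc, void const *mem, size_t len) {
    unsigned char const *data = mem;
    if (data == NULL)
        return 0;
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (unsigned k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
    }
    return ~crc;
}

/* Hypothetical helper: CRC of buf[0..len-1], fed to the routine in pieces
   of at most `chunk` bytes. `chunk` must be greater than zero. */
uint32_t crc32_in_chunks(void const *buf, size_t len, size_t chunk) {
    unsigned char const *p = buf;
    uint32_t crc = crc32iso_hdlc_bit(0, NULL, 0);   /* initial CRC */
    while (len > 0) {
        size_t n = len < chunk ? len : chunk;       /* size of this piece */
        crc = crc32iso_hdlc_bit(crc, p, n);         /* update with the piece */
        p += n;
        len -= n;
    }
    return crc;
}
```

For the nine ASCII bytes "123456789", both a single call and any chunked call give 0xCBF43926, the usual check value for this CRC.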

How to improve time efficiency of CRC-5 calculation?

I have just started to study CRCs and how to implement them in software. My main source of information is the following document, which describes a simple algorithm for calculating the CRC for any generator polynomial. I have attempted to write this algorithm in C++. I tested it for the generator polynomial x^5 + x^4 + x^2 + 1 (CRC-5, the generator polynomial used by the chip) against this online calculator, and it works.
#include <cstdint>
#include <iostream>
using namespace std;

int main() {
    uint8_t data_byte = 0x31;
    // polynom x^5 + x^4 + x^2 + 1
    uint16_t polynom = 0x35;
    // register contains 0 at the beginning
    uint32_t crc = 0;
    uint32_t message = 0;
    // shift the message byte to left by so many bits which is needed for generator polynomial
    message = (data_byte << 5);
    // now the message byte is 13 bits long
    uint8_t processed_bit = 13;
    while(processed_bit > 0) {
        // prepare free space for new bit from the message byte
        crc = crc << 1;
        // find out value of current msb in the message byte
        message = message << 1;
        if(message & 0x2000) {
            // msb in message byte is "1"
            // lsb in register is set to "1"
            crc |= 1U;
        } else {
            // msb in message byte is "0"
            // lsb in register is set to "0"
            crc &= ~1U;
        }
        // remove msb from message byte
        message = message & ~0x2000;
        if(crc & 0x20) {
            // subtract current multiple of the generator polynomial
            crc = crc ^ polynom;
            // remove msb from the register
            crc = crc & ~0x20;
        }
        // one more message bit has been processed
        processed_bit--;
    }
    cout << "CRC: " << (int)crc << endl;
    return 0;
}
It is obvious that this program is inefficient as far as execution time is concerned, so I have been thinking about how to improve it. I know that one option is a look-up table containing the precalculated remainders, but I would like to avoid this method. Does anybody know how to improve the above algorithm from the execution-time perspective? Thanks in advance for any suggestions.
Just a quick glance shows several unnecessary statements. You don't need crc &= ~1U;, since the crc = crc << 1; already put a zero there. You don't need message = message & ~0x2000;, since you are only ever looking at one bit in there. Just let the other bits shift up and away. You don't need the crc = crc & ~0x20;, since the exclusive-or with the polynomial already did that.
If you read the document you linked, you will find that you do not need to process five more bits (13 total). You only need to process the eight message bits. Also reading that document, you do not need to feed in the message bits one at a time. You can exclusive-or the message byte directly onto the CRC register, and then process the eight bits all in the register.
Finally, you can speed up the calculation significantly with a table look up, processing eight bits at a time instead of one bit at a time. This is also described beautifully in the document you linked. You can find code here to automatically generate the table and C code for the calculation.
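For reference, a table-driven version for this polynomial could look roughly like the sketch below. It assumes the same parameters as the question's code (zero initial value, no final exclusive-or, most significant bit first) and keeps the five-bit CRC in the top of a byte so that a whole message byte can be exclusive-ored in at once. The names `crc5_make_table` and `crc5_bytes` are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* x^5 + x^4 + x^2 + 1 with the x^5 term dropped (0x15), moved to the top
   five bits of a byte: 0x15 << 3 == 0xa8. */
#define CRC5_POLY_TOP 0xa8

static uint8_t crc5_table[256];

/* Entry i is the register after byte value i has been clocked through
   eight shift-and-exclusive-or steps. Call once before crc5_bytes(). */
void crc5_make_table(void) {
    for (unsigned i = 0; i < 256; i++) {
        uint8_t r = (uint8_t)i;
        for (int k = 0; k < 8; k++)
            r = (uint8_t)(r & 0x80 ? (r << 1) ^ CRC5_POLY_TOP : r << 1);
        crc5_table[i] = r;
    }
}

/* CRC-5 of a byte string, one table look-up per byte. */
uint8_t crc5_bytes(const uint8_t *data, size_t len) {
    uint8_t r = 0;                      /* CRC lives in the top five bits */
    for (size_t i = 0; i < len; i++)
        r = crc5_table[r ^ data[i]];
    return r >> 3;                      /* move the CRC down to bits 4..0 */
}
```

For the single byte 0x31 used in the question this gives 0x0B. The table costs 256 bytes, which is the trade-off the question wanted to avoid, so treat it as an option rather than a recommendation.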
In the end though, none of this matters if you're not calculating the right thing to begin with. You need to verify the calculation with data from the chip first. I found this document with details on the CRC calculation for that chip. You need to spend some time with it and understand it in detail.
To answer your question directly, here is some code that does what your code does, but is much simpler. Also it is extended to work on n bits, not just eight. It does n loops instead of n+5 loops:
// Return a CRC-5 of the low n bits of data. The remaining bits of data must be
// zero. n must be in [5..32].
uint8_t crc5(uint32_t data, int n) {
    int shift = n - 5;
    uint32_t poly = (uint32_t)0x15 << shift;
    uint32_t top = (uint32_t)1 << (n - 1);
    do {
        data = data & top ? (data << 1) ^ poly : data << 1;
    } while (--n);
    return (data >> shift) & 0x1f;
}
Simpler and faster still is the equivalent of yours restricted to eight bits, unrolled:
uint8_t crc5_8(uint8_t data) {
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    return data >> 3;
}
However neither can calculate what you need for your chip.
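If you want to convince yourself that the n-loop routine computes the same remainder as the 13-step approach, one way is to compare it against plain polynomial long division for every byte value. The sketch below does that; `crc5_longdiv` and `crc5_mismatches` are hypothetical names introduced only for this check, and `crc5` is repeated so the example stands alone.

```c
#include <stdint.h>

/* The general routine from above: CRC-5 of the low n bits of data,
   n in [5..32], remaining bits zero. */
uint8_t crc5(uint32_t data, int n) {
    int shift = n - 5;
    uint32_t poly = (uint32_t)0x15 << shift;
    uint32_t top = (uint32_t)1 << (n - 1);
    do {
        data = data & top ? (data << 1) ^ poly : data << 1;
    } while (--n);
    return (data >> shift) & 0x1f;
}

/* Reference: long division of (byte * x^5) by x^5 + x^4 + x^2 + 1 (0x35).
   Wherever a bit at position 5 or higher is set, subtract (exclusive-or)
   the polynomial aligned under that bit. The remainder is left in the
   low five bits. */
uint8_t crc5_longdiv(uint8_t byte) {
    uint32_t rem = (uint32_t)byte << 5;         /* 13-bit dividend */
    for (int bit = 12; bit >= 5; bit--)
        if (rem & ((uint32_t)1 << bit))
            rem ^= (uint32_t)0x35 << (bit - 5);
    return rem & 0x1f;
}

/* Count the byte values for which the two routines disagree. */
int crc5_mismatches(void) {
    int bad = 0;
    for (unsigned b = 0; b < 256; b++)
        if (crc5(b, 8) != crc5_longdiv((uint8_t)b))
            bad++;
    return bad;
}
```

`crc5_mismatches()` returns 0, and both routines give 0x0B for the byte 0x31. This only shows that the two methods agree with each other; it says nothing about whether either matches the chip.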

How to grab specific bits from a 256 bit message?

I'm using winsock to receive udp messages 256 bits long. I use 8 32-bit integers to hold the data.
int32_t dataReceived[8];
recvfrom(client, (char *)&dataReceived, 8 * sizeof(int), 0, &fromAddr, &fromLen);
I need to grab specific bits like, bit #100, #225, #55, etc. So some bits will be in dataReceived[3], some in dataReceived[4], etc.
I was thinking I need to bitshift each array, but things got complicated. Am I approaching this all wrong?
Why are you using int32_t type for buffer elements and not uint32_t?
I usually use something like this:
int bit_needed = 100;
uint32_t the_bit = dataReceived[bit_needed>>5] & (1U << (bit_needed & 0x1F));
Or you can use this one, which yields 0 or 1 (right-shifting a negative signed value is implementation-defined, so prefer uint32_t elements):
int bit_needed = 100;
uint32_t the_bit = (dataReceived[bit_needed>>5] >> (bit_needed & 0x1F)) & 1U;
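Note that indexing the int32_t array itself with a byte number, as in dataReceived[bit_needed / 8], reaches only the lowest 8 bits of each element and runs past the end of the array for bits 64 and up. Index by 32-bit word as above, or view the buffer as bytes.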
When you count bits and bytes from 0, view the buffer as bytes:
int bit_needed = 100;
const unsigned char *bytes = (const unsigned char *)dataReceived;
int byte = bit_needed / 8;
int bit = bit_needed % 8;
int the_bit = bytes[byte] & (1 << bit);
If the required bit is 0, then the_bit will be zero. If it is 1, then the_bit will hold 2 to the power of that bit's position within the byte.
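You can make a small function to do the job.
uint8_t checkbit(const uint32_t *dataReceived, int bitToCheck)
{
    int word = bitToCheck / 32;             // which 32-bit element holds the bit
    int bit = bitToCheck - word * 32;       // position of the bit within that element
    if (dataReceived[word] & (1U << bit))
        return 1;
    return 0;
}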
Note that you should use uint32_t rather than int32_t, if you are using bit shifting. Signed integer bit shifts lead to unwanted results, especially if the MSbit is 1.
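Putting the word-index and shift together, a self-contained sketch might look like this. Which physical bit counts as "bit #100" depends on how the sender numbers bits and on byte order, which the question does not say, so both numbering schemes below are assumptions; `get_bit` and `get_bit_msb_first` are names invented for this example.

```c
#include <stdint.h>

/* Bit `bit` (0..255) of a 256-bit message held in eight 32-bit words.
   Assumed numbering: bit 0 is the least significant bit of words[0],
   bit 32 is the least significant bit of words[1], and so on. */
unsigned get_bit(const uint32_t words[8], unsigned bit) {
    return (words[bit >> 5] >> (bit & 31)) & 1u;
}

/* Same idea on the raw received bytes, for a sender that numbers bits
   from the most significant bit of the first byte (another assumption). */
unsigned get_bit_msb_first(const unsigned char *bytes, unsigned bit) {
    return (bytes[bit >> 3] >> (7 - (bit & 7))) & 1u;
}
```

With the first scheme, bit #100 is bit 4 of words[3] (100 = 3 * 32 + 4). With the second, it is the bit with mask 0x08 in byte 12 (100 = 12 * 8 + 4). Both return exactly 0 or 1, so the result can be used directly in arithmetic.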
You can use a macro in C or C++ to check for a specific bit:
#define bit_is_set(var,bit) ((var) & (1U << (bit)))
and then a simple if:
if (bit_is_set(var, bit)) {
    //bit is set
}

Split up 32 bit value in C++ and concatenate the chunks in MATLAB

I'm working on a project where I have to send 32-bit values over UART to MATLAB, where I need to print them in the MATLAB terminal. I do this by splitting the 32-bit value into 8-bit values like so:
void Configurator::send(void) {
    /*
     * Split the 32 bits in chunks of 4 bytes of 8 bits
     */
    union {
        uint32_t data;
        uint8_t bytes[4];
    } splitData;
    splitData.data = 1234587;
    // a range-based for yields the byte values themselves, not indices
    for (uint8_t b : splitData.bytes) {
        XUartPs_SendByte(STDOUT_BASEADDRESS, b);
    }
}
In MATLAB I receive the following 4 bytes:
Now the question is, how do I restore the 1234587?
Am I correct in creating an array of size 4 as uint8_t? I would also like to note that I'm using union for readability. If I'm doing it wrong, I'd be happy to hear why!
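You could use left shifts to restore the value (convert each byte to uint32_t first, so that shifting by 24 cannot overflow a signed int):
uint32_t value = ((uint32_t)byte[3] << 24) | ((uint32_t)byte[2] << 16) | ((uint32_t)byte[1] << 8) | (uint32_t)byte[0];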
Try to avoid using unions for this sort of thing. It is not (in principle) portable, and can cause undefined behaviour. Instead write it like this:
void Configurator::send(void) {
    /*
     * Split the 32 bits in chunks of 4 bytes of 8 bits
     */
    uint32_t data = 1234587;
    for (int n = 0; n < 4; n++) {
        // least significant octet first
        unsigned char octet = (data >> (n * 8)) & 0xFF;
        XUartPs_SendByte(STDOUT_BASEADDRESS, octet);
    }
}

uint32_t recieveBytes()
{
    uint32_t result = 0;
    for (int n = 0; n < 4; n++)
    {
        unsigned char octet = getOctet();
        uint32_t octet32 = octet;
        // put each octet back at the position it was taken from
        result |= octet32 << (n * 8);
    }
    return result;
}
The point is that by shifting out the bytes like this, you avoid any problems with endianness. The masking also means that if either end has 32-bit chars (such platforms exist), it all works anyway.
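Here is the same shift-and-mask idea as a small round trip that can be run without the UART. It is a sketch only: `split_u32` and `join_u32` are hypothetical names, and a plain array stands in for the serial link.

```c
#include <stdint.h>

/* Split value into four octets, least significant octet first. */
void split_u32(uint32_t value, unsigned char out[4]) {
    for (int n = 0; n < 4; n++)
        out[n] = (value >> (n * 8)) & 0xff;
}

/* Rebuild the value from four octets received least significant first. */
uint32_t join_u32(const unsigned char in[4]) {
    uint32_t result = 0;
    for (int n = 0; n < 4; n++)
        result |= (uint32_t)in[n] << (n * 8);   /* widen before shifting */
    return result;
}
```

For 1234587 (0x0012D69B) the octets come out as 0x9B, 0xD6, 0x12, 0x00, that is 155, 214, 18, 0 in decimal. The receiving side restores the value with the same arithmetic in any language: 155 + 214 * 256 + 18 * 65536 + 0 * 16777216 = 1234587.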

Calculate_CRC32 function. How do I convert it to calculating bytes and not bits

I am helping out a friend of mine who is a bit stuck, and my own C++ skills are very rusty. My interest and curiosity are quite piqued by this, so I shall try to explain it as best I can. Note that it is a 32-bit check.
uint32_t CRC32::calculate_CRC32(const uint32_t* plData, uint32_t lLength, uint32_t previousCrc32)
{
    uint32_t lCount;
    const uint32_t lPolynomial = 0x04C11DB7;
    uint32_t lCrc = previousCrc32;
    unsigned char* plCurrent = (unsigned char*) plData;
    lCrc ^= *plCurrent++;
    while (lLength-- != 0)
    {
        for (lCount = 0 ; lCount < lLength; lCount++)
        {
            if (lCrc & 1)
                lCrc = (lCrc >> 8) ^ lPolynomial;
            else
                lCrc = lCrc >> 8;
        }
    }
    return lCrc;
}
Now lLength is the number of bytes that the packet contains, and plData is the packet whose data needs to be checked. As it is, the function works, but it works bit by bit. It needs to be improved to work byte by byte. So, to all the genius C++ developers out there whose knowledge far surpasses mine: any ideas would be really helpful. Thanks in advance, guys.
Read Ross Williams' excellent tutorial on CRCs, especially section 9 on "A Table-Driven Implementation", which calculates the CRC a byte at a time instead of a bit at a time. You can also look at the somewhat more involved CRC implementation in zlib, which calculates it four bytes at a time. You can also calculate it eight bytes at a time.
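To give an idea of what the table-driven form looks like, here is a minimal byte-at-a-time sketch. It assumes the common CRC-32/ISO-HDLC variant (the one zlib computes): reflected, initial value and final exclusive-or of all ones, using 0xEDB88320, which is the bit-reversed form of the 0x04C11DB7 in the question. Whether that is the variant your friend's packets need is something to check against known-good data. The function names are made up for this example.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc32_table[256];

/* Entry n is the CRC register after byte value n has been processed one
   bit at a time. Call once before crc32_bytewise(). */
void crc32_make_table(void) {
    for (uint32_t n = 0; n < 256; n++) {
        uint32_t c = n;
        for (int k = 0; k < 8; k++)
            c = c & 1 ? (c >> 1) ^ 0xedb88320u : c >> 1;
        crc32_table[n] = c;
    }
}

/* Update crc with len bytes, one table look-up per byte. Start with
   crc == 0 and pass the previous result back in for each further chunk. */
uint32_t crc32_bytewise(uint32_t crc, const void *mem, size_t len) {
    const unsigned char *data = mem;
    crc = ~crc;
    for (size_t i = 0; i < len; i++)
        crc = (crc >> 8) ^ crc32_table[(crc ^ data[i]) & 0xff];
    return ~crc;
}
```

The inner eight-step bit loop has moved into the table, so the per-byte work is one exclusive-or, one shift and one look-up. For the nine ASCII bytes "123456789" this returns 0xCBF43926.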