crc32 function explanation with regard to data streams

Where can I find this crc32() function explained in detail? I saw this link but couldn't figure out how the CRC is being calculated.
I asked a question about updating the CRC as the data streams in, instead of waiting to have all the data. I got the following answer (thanks to @MarkAdler):
unsigned long crc = crc32(0, NULL, 0); // initial CRC
for (...) { // some sort of loop
... // generating a chunk of data
crc = crc32(crc, buf, len); // update the CRC with the data
... // this all gets repeated many times
}
... // loop is done, crc has the CRC
Can you please be more specific about the crc32() function?
Is there any pseudo code for that function which explains it?
And the for loop here is for getting the data, right?
Thank you

More specific how?
Do you want an explanation of how to use it, or how it works internally? How it works internally is in the source code you linked.
Yes, the loop is for getting the data. You would get chunks of data from wherever, and feed them through crc32() to get the CRC-32/ISO-HDLC of the entire stream of data.
As for the source code you linked, that was not written to teach anyone about how CRCs work. It was written to be fast. Here is a simple but slow (one bit at a time) CRC routine in C that may or may not help you with what you're looking for:
#include <stddef.h>
#include <stdint.h>
uint32_t crc32iso_hdlc_bit(uint32_t crc, void const *mem, size_t len) {
    unsigned char const *data = mem;
    if (data == NULL)
        return 0;               // initial CRC value
    crc = ~crc;                 // undo the final inversion of the previous call
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (unsigned k = 0; k < 8; k++) {
            crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        }
    }
    crc = ~crc;
    return crc;
}
(That code was generated by crcany.)
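To make the streaming use concrete, here is a minimal sketch that feeds a buffer through the bit-at-a-time routine in pieces and arrives at the same CRC as a single call would. The helper name crc32_in_chunks and its chunk parameter are made up for illustration; in a real program the pieces would come from a file, socket, or similar.

```c
#include <stddef.h>
#include <stdint.h>

// Bit-at-a-time CRC-32/ISO-HDLC, same logic as the routine above.
uint32_t crc32iso_hdlc_bit(uint32_t crc, void const *mem, size_t len) {
    unsigned char const *data = mem;
    if (data == NULL)
        return 0;
    crc = ~crc;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (unsigned k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
    }
    return ~crc;
}

// Hypothetical helper: compute the CRC of a buffer by feeding it in pieces of
// at most `chunk` bytes, standing in for data that arrives piece by piece.
// `chunk` must be nonzero.
uint32_t crc32_in_chunks(void const *mem, size_t len, size_t chunk) {
    unsigned char const *p = mem;
    uint32_t crc = crc32iso_hdlc_bit(0, NULL, 0);   // initial CRC
    while (len > 0) {
        size_t n = len < chunk ? len : chunk;
        crc = crc32iso_hdlc_bit(crc, p, n);         // update with this piece
        p += n;
        len -= n;
    }
    return crc;
}
```

The standard check value for this CRC is 0xCBF43926 for the nine ASCII bytes "123456789", and the result does not depend on how the input is split into pieces.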
If you would like to learn more about CRCs, start here.

Related

MODBUS (RTU mode) CRC calculation: what's wrong? Is it a misprint in the DPS5020 user manual?

I am analyzing the MODBUS protocol (rs232 com port) used in the DPS5020 power supply module and I cannot understand the CRC calculation method in RTU mode (page 3) https://cloud.kyme32.ro/ftp_backup/DPS5020%20PC%20Software(2017.11.04)/DPS5020%20CNC%20Communication%20%20Protocol%20V1.2.pdf.
In the first example on page 4, for sending the bytes 1, 3, 0, 2, 0, 2, the CRC value is given as 65CB (hex, with the 2 bytes swapped).
I've also tried several online CRC calculators but can't find the right value.
I also made a step-by-step diagram of the calculation and the right rotation of the bits, but my values do not match.
Is it necessary to use all the bytes of the frame (6) for the calculation or only the data values (4)? I have tried both without success...
Could you kindly post a short diagram of how the calculation is done, with the intermediate values step by step (16-bit XOR with the value A001, rotate right yes/no, etc.)?
I know that in the end you have to swap the 2 bytes, but my individual values do not match anyway.
Or is it simply a misprint in the manual?
All bytes in the frame are used in the CRC calculation.
Here is a C implementation of the CRC, which should answer your question about exactly what to shift and exclusive-or when:
#include <stddef.h>
#include <stdint.h>
uint16_t crc16modbus_bit(uint16_t crc, void const *mem, size_t len) {
    unsigned char const *data = mem;
    if (data == NULL)
        return 0xffff;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (unsigned k = 0; k < 8; k++) {
            crc = crc & 1 ? (crc >> 1) ^ 0xa001 : crc >> 1;
        }
    }
    return crc;
}
(The initial CRC value is returned when called with mem equal to NULL.)
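As a worked check against the example in the question, the sketch below runs that routine over all six frame bytes 01 03 00 02 00 02 and appends the result low byte first. The helper modbus_append_crc is a made-up name for illustration. Working through the shifts by hand gives 0x807E after the first byte, 0x2140 after the second, and 0xCB65 at the end, which goes on the wire as 65 CB.

```c
#include <stddef.h>
#include <stdint.h>

// Bit-at-a-time CRC-16/MODBUS, same logic as the routine above.
uint16_t crc16modbus_bit(uint16_t crc, void const *mem, size_t len) {
    unsigned char const *data = mem;
    if (data == NULL)
        return 0xffff;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (unsigned k = 0; k < 8; k++)
            crc = crc & 1 ? (crc >> 1) ^ 0xa001 : crc >> 1;
    }
    return crc;
}

// Hypothetical helper: compute the CRC over the first len bytes of frame and
// store it right after them, low byte first. frame needs room for len + 2.
void modbus_append_crc(unsigned char *frame, size_t len) {
    uint16_t crc = crc16modbus_bit(0xffff, frame, len);
    frame[len] = crc & 0xff;        // low byte goes out first
    frame[len + 1] = crc >> 8;      // then the high byte
}
```

A receiver can run the same routine over the whole frame including the two CRC bytes; a result of zero means the frame arrived intact.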

How to improve time efficiency of CRC-5 calculation?

I have just started to study CRCs and how to implement them in software. My main source of information is the following document, which describes a simple algorithm for calculating the CRC for any generator polynomial. I have attempted to write this algorithm in C++. I have tested it for the generator polynomial x^5 + x^4 + x^2 + 1 (CRC-5), the generator polynomial used by the chip, using this online calculator, and it works.
#include <cstdint>
#include <iostream>
using namespace std;
int main() {
uint8_t data_byte = 0x31;
// polynom x^5 + x^4 + x^2 + 1
uint16_t polynom = 0x35;
// register contains 0 at the beginning
uint32_t crc = 0;
uint32_t message = 0;
// shift the message byte to left by so many bits which is needed for generator polynomial
message = (data_byte << 5);
// now the message byte is 13 bits long
uint8_t processed_bit = 13;
while(processed_bit > 0) {
// prepare free space for new bit from the message byte
crc = crc << 1;
// find out value of current msb in the message byte
message = message << 1;
if(message & 0x2000) {
// msb in message byte is "1"
// lsb in register is set to "1"
crc |= 1U;
} else {
// msb in message byte is "0"
// lsb in register is set to "0"
crc &= ~1U;
}
// remove msb from message byte
message = message & ~0x2000;
if(crc & 0x20) {
// subtract current multiple of the generator polynomial
crc = crc ^ polynom;
}
// remove msb from the register
crc = crc & ~0x20;
processed_bit--;
}
cout << "CRC: " << (int)crc << endl;
return 0;
}
It is obvious that this program is inefficient in terms of execution time, so I have been thinking about how to improve it. I know that there is a variant that uses a look-up table containing precalculated remainders, but I would like to avoid this method. Does anybody know how to improve the execution time of the above algorithm? Thanks in advance for any suggestions.
Just a quick glance shows several unnecessary statements. You don't need crc &= ~1U;, since the crc = crc << 1; already put a zero there. You don't need message = message & ~0x2000;, since you are only ever looking at one bit in there. Just let the other bits shift up and away. You don't need the crc = crc & ~0x20;, since the exclusive-or with the polynomial already did that.
If you read the document you linked, you will find that you do not need to process five more bits (13 total). You only need to process the eight message bits. Also reading that document, you do not need to feed in the message bits one at a time. You can exclusive-or the message byte directly onto the CRC register, and then process the eight bits all in the register.
Finally, you can speed up the calculation significantly with a table look up, processing eight bits at a time instead of one bit at a time. This is also described beautifully in the document you linked. You can find code here to automatically generate the table and C code for the calculation.
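For illustration, a table-driven version might look like the sketch below. It assumes the same parameters as the code in the question (polynomial x^5 + x^4 + x^2 + 1, zero initial value, most significant bit first), not whatever the chip actually requires. The names crc5_table, crc5_init, and crc5_bytes are made up here. The 5-bit CRC is kept in the top five bits of a byte so that each message byte can be exclusive-ored straight onto the register.

```c
#include <stddef.h>
#include <stdint.h>

static uint8_t crc5_table[256];

// Build the table once: entry b is the register after eight shift steps
// starting from b. 0xa8 is the polynomial 0x15 shifted up by three bits.
void crc5_init(void) {
    for (unsigned b = 0; b < 256; b++) {
        uint8_t r = (uint8_t)b;
        for (int k = 0; k < 8; k++)
            r = (r & 0x80) ? (uint8_t)((r << 1) ^ 0xa8) : (uint8_t)(r << 1);
        crc5_table[b] = r;
    }
}

// CRC-5 of len bytes, one table look-up per byte. crc5_init() must have been
// called first.
uint8_t crc5_bytes(const uint8_t *data, size_t len) {
    uint8_t crc = 0;                        // CRC lives in the top 5 bits
    for (size_t i = 0; i < len; i++)
        crc = crc5_table[crc ^ data[i]];
    return crc >> 3;                        // move it down to the low 5 bits
}
```

For the single byte 0x31 used in the question this gives 0x0B, which is also the remainder of long division of 0x31 shifted left by five bits by 0x35.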
In the end though, none of this matters if you're not calculating the right thing to begin with. You need to verify the calculation with data from the chip first. I found this document with details on the CRC calculation for that chip. You need to spend some time with it and understand it in detail.
To answer your question directly, here is some code that does what your code does, but is much simpler. Also it is extended to work on n bits, not just eight. It does n loops instead of n+5 loops:
// Return a CRC-5 of the low n bits of data. The remaining bits of data must be
// zero. n must be in [5..32].
uint8_t crc5(uint32_t data, int n) {
    int shift = n - 5;
    uint32_t poly = (uint32_t)0x15 << shift;
    uint32_t top = (uint32_t)1 << (n - 1);
    do {
        data = data & top ? (data << 1) ^ poly : data << 1;
    } while (--n);
    return (data >> shift) & 0x1f;
}
Simpler and faster still is the equivalent of yours restricted to eight bits, unrolled:
uint8_t crc5_8(uint8_t data) {
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    data = data & 0x80 ? (data << 1) ^ 0xa8 : data << 1;
    return data >> 3;
}
However neither can calculate what you need for your chip.

Writing a C++ iostream that uses the RC4 stream cipher. How can I optimize my implementation?

I am implementing a custom iostream (i.e., with read, write, seek and close) which uses the RC4 stream cipher for encryption and decryption. One of the contracts of this stream is that it is bidirectional and calling code needs to be able to arbitrarily seek to any position in the stream before doing any actual reading or writing.
Now because RC4 utilizes a key that relies on all previous swap operations up to a given 'tell' position, how can I incorporate an ability to arbitrarily seek to any position?
Obviously I could seek up to the position of the given seek offset (marked by THIS BIT in the following example) before doing the actual xor-ing transformation process, something like:
/**
 * @brief called from a stream's read or write function
 * @param in the input buffer
 * @param out the output buffer
 * @param startPosition the current stream position (obtained via the stream's
 * tellg or tellp functions for read and write respectively)
 * @param length the number of bytes to transform
 */
void transform(char *in, char *out,
std::ios_base::streamoff startPosition,
long length)
{
// need to reset sbox from member s_box each time this
// function is called
long sbox[256];
for (int i = 0; i<256; ++i) {
sbox[i]=m_sbox[i];
}
// ***THIS BIT***
// need to run the swap operation startPosition times
// to get sbox integer sequence in order
int i = 0, j = 0, k = 0;
for (int a=0; a < startPosition; ++a) {
i = (i + 1) % 256;
j = (j + sbox[i]) % 256;
swapints(sbox, i, j);
}
// now do the actual xoring process up to the length
// of how many bytes are being read or written
for (int a=0; a < length; ++a) {
i = (i + 1) % 256;
j = (j + sbox[i]) % 256;
swapints(sbox, i, j);
k = sbox[(sbox[i] + sbox[j]) % 256];
out[a] = in[a] ^ k;
}
}
and then the transform would be called from the read or write of the stream implementation, something like:
MyStream&
MyStream::read(char * const buf, std::streamsize const n)
{
std::ios_base::streamoff start = m_stream.tellg();
std::vector<char> in;
in.resize(n);
(void)m_stream.read(&in.front(), n);
m_byteTransformer->transform(&in.front(), buf, start, n);
return *this;
}
EDIT: the stream should have no knowledge of how the transformation function works. The transformation function is completely independent and I should be able to freely swap in different transformation implementations.
EDIT: the function swapints looks like this:
void swapints(long *array, long ndx1, long ndx2)
{
int temp = array[ndx1];
array[ndx1] = array[ndx2];
array[ndx2] = temp;
}
The real problem with the above transform function is in its slowness because it has to perform startPosition initial swap operations before the xor transformation-proper is performed. This is very problematic when many seek operations are performed. Now I've heard that RC4 is meant to be quick but my (probably bad implementation) suggests otherwise given the initial set of swap operations.
So my real question is: how can the above code be optimized to reduce the number of required operations? Ideally I would like to eliminate the initial ("THIS BIT") set of swap operations.
EDIT: optimizing out the initial sbox setting is probably trivial (e.g. using memcpy as suggested by egur). The important optimization I think is how I can optimize out the loop marked by THIS BIT. Perhaps all those swap ints can be programmed more concisely without the need for a for-loop.
Thanks,
Ben
Change every % 256 to & 0xff, which is much faster:
i = (i + 1) % 256;
To:
i = (i + 1) & 0xFF;
Edit:
You're wasting a lot of time initializing sbox. You should pass sbox as a parameter to the transform function so the original copy is updated between calls. What you're doing now is initializing it again and again and every time it will take longer since startPosition grows.
void transform(char *in, char *out,
long length,
unsigned char* sbox)
The temporary sbox should be a member of the MyStream class. The read function should be:
MyStream&
MyStream::read(char * const buf, std::streamsize const n)
{
std::ios_base::streamoff start = m_stream.tellg();
std::vector<char> in;
in.resize(n);
(void)m_stream.read(&in.front(), n);
// init m_TempSbox on first call
if (m_FirstCall) {
initTempSbox();
}
m_byteTransformer->transform(&in.front(), buf, n, m_TempSbox);
return *this;
}
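A minimal sketch of that idea in plain C follows. It assumes strictly sequential access (no seeking), and the names rc4_state, rc4_init, and rc4_transform are made up for illustration. Note that the indices i and j have to be carried between calls along with the S-box; keeping only the S-box is not enough.

```c
#include <stddef.h>
#include <stdint.h>

// Hypothetical bundle of everything RC4 needs to remember between calls.
typedef struct {
    uint8_t s[256];
    uint8_t i, j;
} rc4_state;

// Standard RC4 key schedule. keylen must be nonzero.
void rc4_init(rc4_state *st, const uint8_t *key, size_t keylen) {
    for (unsigned n = 0; n < 256; n++)
        st->s[n] = (uint8_t)n;
    uint8_t j = 0;
    for (unsigned n = 0; n < 256; n++) {
        j = (uint8_t)(j + st->s[n] + key[n % keylen]);
        uint8_t t = st->s[n];
        st->s[n] = st->s[j];
        st->s[j] = t;
    }
    st->i = 0;
    st->j = 0;
}

// XOR len bytes with the next len key-stream bytes. The state advances, so
// the next call continues where this one stopped. in and out may be equal.
void rc4_transform(rc4_state *st, const uint8_t *in, uint8_t *out, size_t len) {
    for (size_t n = 0; n < len; n++) {
        st->i = (uint8_t)(st->i + 1);               // uint8_t wraps mod 256
        st->j = (uint8_t)(st->j + st->s[st->i]);
        uint8_t t = st->s[st->i];
        st->s[st->i] = st->s[st->j];
        st->s[st->j] = t;
        out[n] = in[n] ^ st->s[(uint8_t)(st->s[st->i] + st->s[st->j])];
    }
}
```

With this layout, reading a stream in two calls produces the same bytes as reading it in one, with no replay of earlier swaps. A seek backwards would still require re-initializing and replaying.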
After some research, it turns out that random access of RC4's key-stream is not possible. See discussion at this link: crypto.stackexchange. A better alternative (as pointed out by Rossum in his comment) is to use a block cipher in counter mode.
What you do in counter mode is to encrypt a sequence of numbers. This sequence is incremental and is the length of the entire stream of data. So, say you want to encrypt 8 bytes of data starting at position '16' of the original data stream using a 64 bit (8 bytes) block cipher.
8 bytes need to be enciphered since you operate over 8-bytes of plain text at a time. Since the position we want to randomly offset to is 16, we essentially encrypt 'block 3' of this number sequence (bytes 0 to 7 == block 1, bytes 8 to 15 == block 2, bytes 16 to 23 == block 3 and so on...)
For example using the XTEA algorithm which encrypts blocks of 8 bytes using a 128 bit key, we'd do something like:
Block 3:
// create a plain text number sequence: 16, 17, ..., 23
uint8_t plainText[8];
for (int n = 0; n < 8; ++n) {
    plainText[n] = 16 + n;
}
// encrypt the number sequence
uint8_t cipherText[8];
applyXTEATransformation(plainText, cipherText, keyOfLength128Bit);
// use the encrypted number sequence as a
// key stream on the data to be encrypted
for (int n = 0; n < 8; ++n) {
    transformedData[16 + n] = dataToBeEncrypted[16 + n] ^ cipherText[n];
}
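A self-contained sketch of the counter-mode idea in C is shown below. It departs from the outline above in one respect, which is an assumption on my part: the counter block holds the 64-bit block index (position / 8) rather than the individual byte positions, so the key stream does not repeat once positions exceed what fits in a byte. The function names are made up, the XTEA routine follows the published reference algorithm with 32 cycles, and there is no nonce, which a real design would need so that two streams under one key do not share a key stream.

```c
#include <stddef.h>
#include <stdint.h>

// XTEA block encryption of one 64-bit block (two 32-bit halves), 32 cycles.
static void xtea_encrypt_block(uint32_t v[2], const uint32_t key[4]) {
    uint32_t v0 = v[0], v1 = v[1], sum = 0;
    const uint32_t delta = 0x9E3779B9;
    for (int r = 0; r < 32; r++) {
        v0 += (((v1 << 4) ^ (v1 >> 5)) + v1) ^ (sum + key[sum & 3]);
        sum += delta;
        v1 += (((v0 << 4) ^ (v0 >> 5)) + v0) ^ (sum + key[(sum >> 11) & 3]);
    }
    v[0] = v0;
    v[1] = v1;
}

// XOR len bytes with the key stream for stream positions pos .. pos+len-1.
// Any position can be reached directly. The same call encrypts and decrypts.
void ctr_transform(const uint32_t key[4], uint64_t pos,
                   const uint8_t *in, uint8_t *out, size_t len) {
    for (size_t n = 0; n < len; n++, pos++) {
        // The block is re-encrypted for every byte to keep the sketch short;
        // real code would cache the key stream for the current block.
        uint64_t block = pos / 8;
        uint32_t ks[2] = { (uint32_t)block, (uint32_t)(block >> 32) };
        xtea_encrypt_block(ks, key);
        unsigned off = (unsigned)(pos % 8);
        uint8_t k = (uint8_t)(ks[off / 4] >> (8 * (off % 4)));
        out[n] = in[n] ^ k;
    }
}
```

Transforming bytes 16 to 23 on their own yields exactly the bytes that a transform of the whole stream produces at those positions, which is the random-access property RC4 lacks.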
tldr: I wanted to do random access on RC4 but discovered it isn't possible so used counter mode on an XTEA block cipher instead.
Ben

Calculate_CRC32 function. How do I convert it to calculating bytes and not bits

I am helping out a friend of mine who is a bit stuck, and my own C++ skills are very rusty. My interest and curiosity are quite piqued by this, so I shall try to explain it as best I can. Note it's a 32-bit check.
uint32_t CRC32::calculate_CRC32(const uint32_t* plData, uint32_t lLength, uint32_t previousCrc32)
{
uint32_t lCount;
const uint32_t lPolynomial = 0x04C11DB7;
uint32_t lCrc = previousCrc32;
unsigned char* plCurrent = (unsigned char*) plData;
lCrc ^= *plCurrent++;
while (lLength-- != 0)
{
for (lCount = 0 ; lCount < lLength; lCount++)
{
if (lCrc & 1)
lCrc = (lCrc >> 8) ^ lPolynomial;
else
lCrc = lCrc >> 8;
}
}
return lCrc;
}
Now lLength is the number of bytes that the packet contains, and plData is the packet whose data needs to be checked. As it is, the function works, but it works bit by bit. It needs to be improved to work byte by byte. So, to all the genius C++ developers out there whose knowledge far surpasses mine: any ideas will be really helpful. Thanks in advance, guys.
Read Ross Williams' excellent tutorial on CRCs, especially section 9 on "A Table-Driven Implementation", which calculates the CRC a byte at a time instead of a bit at a time. You can also look at the somewhat more involved CRC implementation in zlib, which calculates it four bytes at a time. You can also calculate it eight bytes at a time.
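As a rough sketch of what a byte-at-a-time, table-driven CRC-32 can look like: the code below assumes the common reflected CRC-32 as used by zlib, where the constant 0xEDB88320 is the bit-reversed form of the 0x04C11DB7 in the question. Whether that is the exact variant the packet format expects is an assumption, and the names crc32_table_init and crc32_bytewise are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

static uint32_t crc32_table[256];

// Build the table once: entry b is the result of pushing the byte b through
// eight single-bit steps of the CRC.
void crc32_table_init(void) {
    for (uint32_t b = 0; b < 256; b++) {
        uint32_t r = b;
        for (int k = 0; k < 8; k++)
            r = r & 1 ? (r >> 1) ^ 0xedb88320 : r >> 1;
        crc32_table[b] = r;
    }
}

// Update a running CRC-32 with len bytes, one table look-up per byte.
// Start with crc = 0. crc32_table_init() must have been called first.
uint32_t crc32_bytewise(uint32_t crc, const void *mem, size_t len) {
    const unsigned char *data = mem;
    crc = ~crc;
    for (size_t i = 0; i < len; i++)
        crc = (crc >> 8) ^ crc32_table[(crc ^ data[i]) & 0xff];
    return ~crc;
}
```

The inner eight-step loop now runs only while building the table, 256 times in total, instead of once per message byte.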

checksum calculation

To calculate a CRC I found a piece of code, but I do not understand the concept.
Here is the code, where count = 128 and ptr = some value:
unsigned short calcrc(unsigned char *ptr, int count)
{
    unsigned short crc;
    unsigned char i;
    crc = 0;
    while (--count >= 0)
    {
        crc = crc ^ (unsigned short)*ptr++ << 8;
        i = 8;
        do
        {
            if (crc & 0x8000)
                crc = crc << 1 ^ 0x1021;
            else
                crc = crc << 1;
        } while(--i);
    }
    return (crc);
}
Can anybody please explain the logic?
This looks like a CRC (specifically it looks like CRC-16-CCITT, used by things like 802.15.4, X.25, V.41, CDMA, Bluetooth, XMODEM, HDLC, PPP and IrDA). You might want to read up on the CRC theory on the linked-to Wikipedia page, to gain some more insight. Or you can view this as a "black box" that just solves the problem of computing a checksum.
You will probably need to know that in C, the ^ operator is the bitwise XOR operator and the << operator is the left shift operator (equivalent to multiplication by 2 to the power of the number on the right of the operator). Also, the expression crc & 0x8000 tests whether the most significant bit of the variable crc is set.
This will help you work out a low-level explanation of what occurs when this runs. For a high-level explanation of what a CRC is and why you might need it, read the Wikipedia page or How Stuff Works.
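For readers who want to step through it, here is the same algorithm restated with fixed-width types and a comment on each step. The function name crc16_xmodem is my own label, chosen because these parameters (polynomial 0x1021, initial value 0, most significant bit first) match the variant usually catalogued as CRC-16/XMODEM.

```c
#include <stddef.h>
#include <stdint.h>

uint16_t crc16_xmodem(const unsigned char *ptr, size_t count) {
    uint16_t crc = 0;                           // register starts at zero
    while (count--) {
        // Bring the next message byte into the top 8 bits of the register.
        crc ^= (uint16_t)(*ptr++ << 8);
        for (int i = 0; i < 8; i++) {           // one step per message bit
            if (crc & 0x8000)
                // Top bit set: shift it out and subtract (XOR) the polynomial.
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            else
                // Top bit clear: just shift.
                crc = (uint16_t)(crc << 1);
        }
    }
    return crc;                                 // remainder of the division
}
```

The standard check value for these parameters is 0x31C3 for the nine ASCII bytes "123456789", which is a quick way to confirm that a port of the routine still computes the same thing.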
One famous text on CRCs is "A Painless Guide to CRC Error Detection Algorithms" by Ross Williams. It takes some time to absorb but it's pretty thorough.
Take a look at my answer to "How could I guess a checksum algorithm?"