Decoding an unknown CRC or checksum?

Decoding an unknown CRC or checksum? - crc

I've been trying decode the CRC or checksum algorithm that is being used for the serial communication between a drone and its camera for about a week without a lot of luck and I was wondering if anybody here sees something I am missing or has any suggestions.
A typical packet looks like this:
FE1A390100020001AE0BE0FF090046250B00040000004E0D32080008540D8808F4016B54
They always start with 0xFE. The 2nd byte is the total size of the packet minus 10 bytes. The packet sizes vary, but I think I am specifically interested the 0x1A size. Byte 3 seems to be a packet counter because it usually increases by 1, but sometimes I have seen it jump to a completely different number for a few packets (usually when changing to a 0x22 size packet) before resuming the increment by 1 sequence. The last 2 bytes always change and I believe are the checksum or CRC. All the rest of the bytes seem to stay the same from one 0x1A packet to the next unless I manipulate the drones radio controls.
Right after powering up there is a series of packets that I assume is for initializing the communication. They are the shortest packets and have the least amount of change between them so it seems like they might be the easyiest to look at. Here are the first 7 bytes sent after powering it on.
From Drone to camera
Time:
8.3982205 FE030001000000010200018F68
8.39934725 FE03010100000001020001A844
8.400473958 FE03020100000001020001C130
8.401600708 FE050301000000000000000001AAE8
8.402900792 FE1A040100020001000000000000000000000C000300000853060008AB028808F4014629
8.406020958 FE22050100030002000000000000000000000000000000000000B3FFFFFFDE22006300FF615110050000C956
8.4098345 FE1A060100020001000000000000000000000C000300000853060008AB028808F40180A9
If I put the first 3 packets into reveng with -w 16 -s then it comes back with:
reveng: warning: you have only given 3 samples
reveng: warning: to reduce false positives, give 4 or more samples
width=16 poly=0x1487 init=0x0334 refin=false refout=false xorout=0x0000 check=0xa5b9 residue=0x0000 name=(none)
If i add the 4th packet it finds the same poly, but there rest of it looks differnt:
width=16 poly=0x1487 init=0x417d refin=false refout=false xorout=0x5582 check=0xbfa2 residue=0xb059 name=(none)
If i add the 5th packet reveng comes back with no model found.
However, if I remove packet 4 and then run it with packets, 1,2,3 and 5 if finds the same poly again, but different values for the rest:
width=16 poly=0x1487 init=0x804b refin=false refout=false xorout=0x0138 check=0x7dcc residue=0xc8ca name=(none)
Most combinations of packets containing a 0x1A size packet and the first 3 initialization packets that I run through reveng come back with 'no model found'. So far every time I have run reveng with only 0x1a sized packets has failed to find a model.
I think it is possible that after the initialization packets it some how incorporates info it receives from the camera to the drone into the CRC calculation for the data going from the drone to the camera, but there isn't a lot of data in those packets. Here are the first 9 packets that are sent from the camera to the drone. Prior to the first 0x1A packet being sent from the drone, the only data sent from the camera seems to be 0x7D0001.
From camera to drone:
Time
3.474456792 FE0500020000000000007D00013D40
4.475220208 FE0501020000000000007D000168C5
5.476483875 FE0502020000000000007D00018642
6.477295958 FE0503020000000000007D0001D3C7
7.4783405 FE0504020000000000007D00014B45
8.479420458 FE06050200010003FA078538B838B3
8.480811667 FE0506020000000000007D0001F047
9.48057875 FE0507020000000000007D0001A5C2
9.481883 FE06080200010003F9078638B8386037
I have tried incorporating 0x7D0001 into the packets and running them through reveng, but that didn't seem to help.
I have also tried reveng -w 8 -s on various combinations of packets without finding a model. And I have tried various checksum algos manually (possibly incorrectly) without success.
I have a bunch more data that I have captured here:
https://drive.google.com/open?id=1v8MCaXOvP_2Wv_hcaqhUZnXvqNI1_2Ur
Any ideas? Suggestions? This has been driving me nuts for a week

Related

How to determine CRC16 initial checksum so resulting checksum is zero

Working on a SPI communication bus between an array of SAMD MCUs.
I have an incoming packet which is something like { 0x00, 0xFF, 0x00, 0xFF }.
The receiver chip performs CRC16 check on the incoming packet.
Since I am expecting the exact same packet every time, I want to have zero CRC checksum when the packet is valid and not zero checksum when there is a transfer error.
I know that I can add the calculated CRC16 to the end of the packet when sending it and on the receiver side the CRC check will output 0, but in this case it is impossible to add a CRC16 checksum to the packet since the packet is constructed by multiple sender chips on the SPI line and each chip only fills its own two bytes from the entire packet.
I need to load an initial CRC checksum on the receiver side, so after the incoming packet is checked, the resulting CRC equals to zero (if packet is intact).
The answer here on SO is actually what I am looking for, but it is for CRC32 format and I don't actually understand the principle of the code, so I can't rewrite if for CRC16 format.
Any help would be greatly appreciated!
Regards,
Niko

The solution is simply to use a look-up table based CRC. If you can't append the checksum (aka the Frame Check Sequence, FCS) to the package, then do the table look-up first and then simply compare that one against the expected sequence for your fixed data.
Please note that "CRC 16" could mean anything, there are multiple versions and (non)standards. The most common one is perhaps the one called "CRC-16-CCITT" with 1021h poly and initial value FFFFh, but even for that one there's multiple algorithms out there - some are correct, some are broken. Your biggest challenge will be to find a trustworthy CRC algorithm.
However, I actually think SAMD specifically uses hardware-generated CRC-16-CCITT on-chip, for DMA purposes. Since this is SPI, it should be DMA-able, so perhaps investigate if you can use that one somehow.

I found a solution, thanks to the advice of Bastian Molkenthin, who did this great online CRC calculator.
He advised trying a brute force calculation of all the 2^16 values of a CRC16 initial value. Indeed, after a few lines of code and few microseconds later the SAMD51 found an initial value, which matches a zero CRC value for the given buffer.

Identifying CRC-16 algorithm used

I'm trying to communicate with a device over a serial communication protocol and having some trouble finding out about which checksum/crc algorithm that is used for the last 2 bytes of the messages. I've tried several CRC16 algorithms in various online crc utilities, like:
http://www.sunshine2k.de/coding/javascript/crc/crc_js.html
http://www.zorc.breitbandkatze.de/crc.html
I've also tried reverse engineering, with the help of REVENG software, but it only gives some occasional random hits (depending on which of the examples from the captured messages I try together), that does not seem to be a correct algorithm that matches all examples.
I have not found any documentation of the device, which can indicate the CRC16 algorithm used or if its some other variant like the lowest bytes of a CRC32.
Below are 2 types of messages each with some different examples and variations. The first 4 bytes of the message tells the renaming number of bytes of the message. Most probably these 4 first bytes should not be included in the CRC calculation, but that is just a guess. What I believe is a 16 wide CRC is the last 2 bytes of each message.
Message type 1 (examples):
0000000908100300180a4621a8
0000000901100300180a463a11
0000000909100300180a461f26
0000000902100300180a4649f9
000000090a100300180a466cce
0000000903100300180a46fb58
000000090b100300180a46de6f
Message type 2 (examples):
0000001f09131900180a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a0a7be0
0000001f0913190018141414141414141414141414141414141414141414141414f3a5
0000001f09131900181e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e1e3d38
0000001f0913190018282828282828282828282828282828282828282828282828e82f
0000001f09131900183232323232323232323232323232323232323232323232321762
0000001f001319001814ffffffffffffffffffffffffffffffffffffffffffffff3d16
0000001f00131900181effffffffffffffffffffffffffffffffffffffffffffff2e93
0000001f001319001828ffffffffffffffffffffffffffffffffffffffffffffff3438
0000001f00131900185fffffffffffffffffffffffffffffffffffffffffffffffac2b
Anyone out there with some knowledge about CRC that can point me in the right direction to figure this out?

It's not a CRC-16. I ran a simple brute-force search.

Some more messages where only one or two of the last "data" bytes are varying that show some sort of more logical behaviour on the last 2 checksum bytes, but still no clear view of how the algorithm is constructed for the calculation.
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF065B7C
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF075A7C
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF08557C
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF09547C
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0A577C
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0B567C
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0C517C
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF06FFA285
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF07FFA284
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF08FFA28B
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF09FFA28A
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0AFFA289
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0BFFA288
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0CFFA28F
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0DFFA28E
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF06065B85
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF07075A84
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0808558B
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0909548A
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0A0A5789
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0B0B5688
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0C0C518F
0113190018FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF0D0D508E
Full set of 1200 messages at: http://dpaste.com/0DSMZJV

RTP timestamp not linear?

I was trying to reconstruct an audio conversation (a-b call using g711 audio) using the rtp time-stamp. I used to fill silence using difference of two rtp time-stamp and sampling rate. The conversation went out of sync and then I see that rtp time-stamp is not linear.I was not able to get exact clock time using rtp time-stamp and resulted in sync issues. How do i calculate the exact time.

I have the same problem with a Stream provided by GStreamer, whic doesnt provide monotonic timestamps.
for Example: The Difference between the stamps should bei exactly 1920, but it is between ~120 and ~3500, but in average 1920.
The problem here is that there is no way to find missing samples, because you never know if the high difference is from the Encoder delay or from a sample missing.
If you have only Audio to decode, I would try to put "valid" PTS values to each sample (in my case basetime+1920, basetime+3840 and so on.)
The big problem here comes when video AND audio were combined. Here this trick doesnt work well, when samples are missing and there is no way to find out when this is the case :(

when you want to send rtp you should notice about two things:
the time stamp is incremented due to the amout of byte sents.
e.g for PT=10, you may have this pattern:
1160 byte , time stamp increment: 1154 and wait 26 ms
lets see how this calculation happens:
. number of packet should be sent in one second : 1/(26ms) = 38
time stamp increment : clockrate / # = 1154

Regarding to RFC3550 (https://www.ietf.org/rfc/rfc3550.txt)
The sampling instant MUST be derived from a clock that increments
monotonically
Its not a choice nor an option. By the way please read the full description of the timestamp field of the RTP packet, there I found it also:
As an example, for fixed-rate audio
the timestamp clock would likely increment by one for each
sampling period. If an audio application reads blocks covering
160 sampling periods from the input device, the timestamp would be
increased by 160 for each such block, regardless of whether the
block is transmitted in a packet or dropped as silent.
If you want to check linearity then use the RTCP SR RTP and NTP timestamps field. At the SR report the RTP timestamp belongs to the NTP timestamp.
So the difference of two consecutive RTP timestamp (lets call them dRTPt_1, dRTP_2, ...) and the difference of two consecutive NTP timestamps (lets call them dNTP_1, dNTP_2, ...) and then multiply dRTP_t* with clock rate and check weather you get dNTP_t*.
But first please read the RFC.

Why do I get an error when read or write more than 3 bytes using libusb to communicate with a PIC 18F2550?

I'm using libusb in Qt to communicate with a PIC microcontroller, 18F2550. The thing is that it's working OK until I try to send or read more than three bytes. Why does it happen?
I've tried using bulk_read transfer and interrupt_read. When I put the size of the buffer equal or less than three, then the transmission works perfectly, using bulk or interrupt. When this size is greater than three, then I'm getting buffer1 and buffer[2] OK, but the rest are wrong.
The error that I'm getting is from timeout. As input I'm using endpoint 0x81.
More information:
The return value from the bulk or interrupt read is -116. The numbers that I'm sending from the PIC to the PC in the two first bytes ([0] and 1) in hex is 0x02D6. With this number, buffer[0] = -42 (when it should be 0xD6 = 214) and buffer[1] = 2 that is correct.
In the [2] and [3] bytes the number is 0x033D, and I get [2] = 61 = 0x3D. That is correct and [3] = -42??? (like [0]).
And the fifth byte is 1, and the SW shows 2???. Might it be a problem in the microcontroller, because I'm programming it as an HID USB?

I don't think that being a HID is the problem. I had a similar issue before; the PIC would randomly timeout when large data was being transmitted. It turned out to be some voltage fluctuation on the MCU. How are you connecting the crystal? Do you have a capacitor on VUSB to regulate it?
Building a PIC18F USB device is a great tutorial on building a PIC HID, and even though it's not based on 18F2550 but on 18F4550, it should be quite similar, and I'm sure you can get a lot out of the schematics and hardware setup. It was the starting point for my PIC-USB projects.

Can a false sync word be found in the payload of an MPEG-1/MPEG-2 frame?

I know I can find other answers about this on SO, but I want clarifications from somebody who really knows MPEG-1/MPEG-2 (or MP3, obviously).
The start of an MPEG-1/2 frame is 12 set bits starting at a byte boundary, so bytes ff f*, where * is any nibble. Those 12 bits are called a sync word. This is a useful characteristic to find the start of a frame in any MPEG-1/2 stream.
My first question is: formally, can a false sync word be found or not in the payload of an MPEG-1/2 frame, outside its header?
If so, here's my second question: why does the sync word mechanism even exist then? If we cannot make sure that we found a new frame when reading fff, what is the purpose of this sync word?
Please do not even consider ID3 in your answer; I already know about sync words that can be found in ID3v2 payloads, but that's well documented.

I worked on MPEG-2 streams, more precisely Transport Streams (TS): I guess we can find similarities.
A TS is composed of Transport Packets, which have a header, starting with a sync byte 0x47.
We also can found 0x47 within the payload of the TP, but we know that it is not a sync byte because it is not aligned (TP have a fixed size of 188 bytes).
The sync word gives an entry point to someone that looks at the stream, and allows a program to synchronize his process with the stream, hence the name.
It also allows a fast browsing and parsing of the stream: in a TS you can jump from a packet to another (inspect header, check sync byte, skip 188 bytes and so on)
Finally it is a safety measure that helps you to spot errors (in the stream during transmission for example or in the process if a bug caused a bad alignment)
These argument are about TS but I think the same goes with your case : finding a sync word within a payload should not be an issue because you should always able to distinguish payload and header, most of the time with a length information (either because the size is fixed like in TP or because you have a TLV format).

can a false sync word be found or not in the payload of an MPEG-1/2
frame, outside its header?
According to this, "frame sync can be easily (and very frequently) found in any binary file." See the section titled "MPEG Audio Frame Header"
I confirmed this with an .mp3 song that I chose at random (stripped of ID3 tags). It had 5193 sync words, of which only 4898 were found to be valid (using code too long to be included here).
>>> f = open('notag.mp3', 'rb')
>>> r=f.read()
>>> r.count(b'\xff\xfb')
5193
why does the sync word mechanism even exist then? If we cannot make
sure that we found a new frame when reading fff, what is the purpose
of this sync word?
We can be (relatively) sure if we are checking the rest of the frame header, and not just the sync word. There are bits following the sync which can be used to:
identify a false positive or
give you useful info
With .mp3, you have to use those useful bits to calculate the size of the frame. By skipping ahead <frame-size> bytes before looking for the next sync word, you avoid any false syncs that may be present in the payload. See the section titled "How to calculate frame length" in that same link.

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js