Reading bytes from a binary file - C++

I have a binary file and I need to read and store z bytes starting from position (x, y). For example, I have this sequence of bytes:
00000000 49 49 49 49 05 00 00 00 08 00 00 00 1a 00 00 00 | y0
00000010 39 a6 82 f8 47 8b b8 10 78 97 f1 73 56 d9 6f 00 | y1
00000020 58 99 d5 3b ac 7b 7b 40 b6 2e 9f 0a 69 b2 ac a0 | y2
________________________________________________________________
x0 x1 x2 x3 x4 x5 x6 x7 x8 x9 x10 x11 x12 x13 x14 x15
Every two hex digits represent one byte (the dump comes from hexdump -C; the encoding is little endian): 49 is one byte, f8 is one byte, etc.
(x, y) is a position. For example, if I put x = 2, y = 2, I get position (2, 2), meaning I start reading at row y2, column x2; in this case I start at byte d5. If I put z = 3, I want to store 3 bytes, in this case d5, 3b, ac.
I can calculate the position with a simple formula:
position = 16 * y + x
position = 16 * 2 + 2 // put x = 2, y = 2 into the formula
position = 34 // reading starts at the 35th byte, in this case d5
binaryFile.seekg(position); // jump to the position in the file (the next byte read is d5)
binaryFile.read((char *)&dest, z); // read and store z bytes
If I put z = 3, I store 3 bytes: d5, 3b, ac.
But sometimes the coefficients z, x, y are not integers:
If I put y = 2, x = 1.5 and z = 3, i.e. position (1.5, 2), then I have to jump not to byte 99 but to d5, store 2 bytes, in this case d5 and 3b, and add to them half a byte from byte 99 and half a byte from byte ac, because the starting position was x = 1.5. How could I do that?

You have to extend to byte boundaries on both ends, and first read the area that you want to write. So, if you want to write two bytes, you will have to read three bytes.
Then you will have to do appropriate bit shifting and masking to get your bits in the right places.
For example, if you are going to write two bytes shifted ½ byte, you would start with something like this:
unsigned char *mydata = MyDataToWrite();
unsigned char temp[3];
binaryFile.read((char *)temp, 3);
temp[0] = (temp[0] & 0xf0) | (mydata[0] >> 4);
// more code here to put the remaining bits in temp
binaryFile.seekp(-3, std::ios::cur); // back to where the bytes were read
binaryFile.write((char *)temp, 3);
Once you have your data in temp, write the 3 bytes back to the same location from which they were read.
I'm not going to write the whole thing here, but I hope this gives you an idea you can work from.
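For the read direction the question asks about (start at x = 1.5, take z = 3 bytes), the same idea works in reverse: read one extra whole byte and stitch each output byte together from two neighboring nibbles. Below is a minimal C++ sketch under the assumptions that the file is an open std::ifstream in binary mode and that a fractional x always falls on a half-byte boundary; the function name is made up, and which nibble of the boundary byte counts as "first" is a convention you may need to flip.
#include <cstdint>
#include <fstream>
#include <vector>

// Read z bytes starting half a byte past offset (16 * y + wholeX).
// Each result byte combines the low nibble of one file byte with the
// high nibble of the following byte.
std::vector<uint8_t> readNibbleShifted(std::ifstream &in, int wholeX, int y, int z)
{
    in.seekg(16 * y + wholeX);              // same formula as above
    std::vector<uint8_t> raw(z + 1);        // one extra byte on the end
    in.read(reinterpret_cast<char *>(raw.data()), raw.size());

    std::vector<uint8_t> out(z);
    for (int i = 0; i < z; ++i)             // e.g. 99 d5 3b ac -> 9d 53 ba
        out[i] = static_cast<uint8_t>((raw[i] << 4) | (raw[i + 1] >> 4));
    return out;
}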

Related

CRC-8 value mismatch

I'm trying to understand how a protocol works; it's from a TEC-Microsystem device (DX5100). The documentation says:
CRC: Byte of the control sum CRC-8. It can be absent in some options
of the protocol. The control sum CRC-8 is calculated before the
stuffing for the entire packet, beginning with the byte FEND and
finishing with the last databyte. If a packet transmits an address,
when calculating the control sum, its true value is used, i.e. MSB=1
is not taken into account. For the calculation of the control sum the
polynomial is used. CRC = X8 + X5 + X4 + 1.
When I sniff the data being sent by their software, I see this data being transmitted:
0xC0 0x81 0x04 0x02 0x02 0x00 0x55
If a packet transmits an address,
when calculating the control sum, its true value is used, i.e. MSB=1
is not taken into account
This means that the data taken into account to compute the CRC is actually 0xC0 0x01 0x04 0x02 0x02 0x00 (second byte is 0x01 instead of 0x81).
According to what I could find on Wikipedia, "CRC = X8 + X5 + X4 + 1" means they use "CRC-8-Dallas/Maxim": x^8 + x^5 + x^4 + 1 has coefficient bits 1 0011 0001, and dropping the implicit leading x^8 bit gives the polynomial value 0x31.
However, when I use https://crccalc.com/, enter C00104020200 and hit "CALC-CRC-8" it reports 0x82 for "CRC-8/MAXIM", not 0x55. Am I missing something?
More examples from the sniffer:
C0 81 03 02 02 00 D3, so C0 01 03 02 02 00 CRC is D3
C0 81 05 02 02 00 DA, so C0 01 05 02 02 00 CRC is DA
With two examples, you can XOR them, which eliminates the initial value and the final XOR, as if both were 00:
C0 01 03 02 02 00 CRC is D3
C0 01 05 02 02 00 CRC is DA
---------------------------
00 00 06 00 00 00 CRC is 09
This confirms that the CRC polynomial is 0x31 (reversed to 0x8C), input reflected, result reflected.
Using initial value 0xDE didn't work, so I tried reversing its bits to get 0x7B, which works for all three examples in the question. So the initial value is 0x7B; the polynomial is likewise bit-reversed from 0x31 to 0x8C, but the online calculator expects the non-reversed polynomial, 0x31. If you click on "show reflected lookup table", calculate the CRC, then look at row 8, byte 0, you will see the 0x8C.
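As a self-contained check, here is a bit-by-bit C++ sketch of that CRC (an illustration, not code from the device vendor). It runs the reflected, LSB-first algorithm, so the register starts at the bit-reversed form of 0x7B, which is 0xDE, and uses the reflected polynomial 0x8C; it reproduces 0x55, 0xD3 and 0xDA for the three sniffed packets above.
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Reflected (LSB-first) CRC-8: polynomial 0x31 bit-reversed to 0x8C,
// initial value 0x7B bit-reversed to 0xDE, no final XOR.
uint8_t crc8_dx5100(const uint8_t *data, size_t len)
{
    uint8_t crc = 0xDE;                                   // reverse_bits(0x7B)
    for (size_t i = 0; i < len; ++i) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; ++bit)
            crc = (crc & 1) ? (crc >> 1) ^ 0x8C : crc >> 1;
    }
    return crc;
}

int main()
{
    const uint8_t pkt[] = { 0xC0, 0x01, 0x04, 0x02, 0x02, 0x00 };
    std::printf("%02X\n", crc8_dx5100(pkt, sizeof pkt));  // prints 55
}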

C++ replacing order of items in byte array

I'm writing code for Arduino in C++.
I have a byte array with hex byte values, for example:
20 32 36 20 E0 EC 20 F9 F0 E9 E9 E3 F8 5C 70 5C 70 5C 73 20 E3 E2 EC 20 F8 E0 E5 E1 EF 20 39 31 5C
There are four ASCII digits in these bytes:
HEX 0x32 is the digit 2 in ASCII
HEX 0x35 is the digit 5 in ASCII
HEX 0x39 is the digit 9 in ASCII
and so on...
https://www.ascii-codes.com/cp862.html
So the hex values 32, 36 represent the number 26, and 39, 31 represent 91.
I want to find these numbers and reverse each group, so that (in this example) 62 and 19 are represented instead of 26 and 91.
The output would thus have to look like this:
20 36 32 20 E0 EC 20 F9 F0 E9 E9 E3 F8 5C 70 5C 70 5C 73 20 E3 E2 EC 20 F8 E0 E5 E1 EF 20 31 39 5C
The numbers don't have to be two digits; they could be anything in the range 0-1000.
I also know that each group of such numbers is preceded by the hex value 20, if that helps.
I have done this in C# (with some help from Stack Overflow users :-) ):
string result = Regex.Replace(HexMessage1,
    @"(?<=20\-)3[0-9](\-3[0-9])*(?=\-20)",
    match => string.Join("-", Transform(match.Value.Split('-'))));

private static IEnumerable<string> Transform(string[] items)
{
    // Either terse Linq:
    // return items.Reverse();
    // Or good old for loop:
    string[] result = new string[items.Length];
    for (int i = 0; i < items.Length; ++i)
        result[i] = items[items.Length - i - 1];
    return result;
}
Can someone help me make this work in C++?
Loop over the array, element by element, looking for 0x32 or 0x39. If found, check the next byte (if within bounds) to see whether it matches 0x36 or 0x31, respectively. If it does, swap the current and the next byte, then continue the loop, skipping over both.
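Since the numbers can be anything from 0 to 1000, a more general version reverses every run of consecutive ASCII digits (0x30-0x39) in place. A minimal sketch along those lines (the function name is made up, and it assumes each number is a contiguous digit run delimited by non-digit bytes such as 0x20):
// Reverse every run of consecutive ASCII digits (0x30..0x39) in place,
// so e.g. ... 20 32 36 20 ... becomes ... 20 36 32 20 ...
void reverseDigitRuns(unsigned char *buf, unsigned int len)
{
    unsigned int i = 0;
    while (i < len) {
        if (buf[i] < 0x30 || buf[i] > 0x39) { ++i; continue; }
        unsigned int start = i;
        while (i < len && buf[i] >= 0x30 && buf[i] <= 0x39)
            ++i;
        for (unsigned int a = start, b = i - 1; a < b; ++a, --b) {
            unsigned char tmp = buf[a];   // swap ends, walk inwards
            buf[a] = buf[b];
            buf[b] = tmp;
        }
    }
}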

Regex / Python3 - re.findall() - Find all occurrences between opcodes

Background
I'm reverse engineering a TCP stream that uses a Type-Length-Value approach to encoding data.
Example:
TCP Payload: b'0000001f001270622e416374696f6e4e6f74696679425243080310840718880e20901c'
---------------------------------------------------------------------------------------
Type: 00 00 # New function call
Length: 00 1f # Length of Value (Length of Function + Function + Data)
Value: 00 12 # Length of Function
Value: 70 62 2e 41 63 74 69 6f 6e 4e 6f 74 69 66 79 42 52 43 # Function ->(hex2ascii)-> pb.ActionNotifyBRC
Value: 08 03 10 84 07 18 88 0e 20 90 1c # Data
However, the Data field is itself an object that can include multiple variables with variable data lengths.
Data: 08 05 10 04 10 64 18 c8 01 20 ef 0f
----------------------------------------------
Opcode : Value
08 : 05 # var1 : 1 byte
10 : 04 # var2 : 1 byte
18 : c8 01 # var3 : 1-10 bytes
20 : ef 0f # var4 : 1-10 bytes
Currently I am parsing the Data using the following Python3 code:
############################### NOTES ###############################
# Opcodes sometimes rotate starting positions but the general order is always held:
# Data: 20 ef 0f 08 05 10 04 10 64 18 c8 01
#####################################################################
import re
import binascii
def dataVariable(data, start, end):
    p = re.compile(start + b'(.*?)' + end)
    return p.findall(data + data)

data = bytearray.fromhex('08051004106418c80120ef0f')
var3 = dataVariable(data, b'\x18', b'\x20')
print("Variable 3:", end=' ')
for item in set(var3):
    print(binascii.hexlify(item), end=' ')
----------------------------------------------------------------------------
[Output]: Variable 3: b'c801'
So far all good...
Problem
If an Opcode appears in a previous variable's Value, the code is no longer reliable.
Data: 08 05 10 04 10 64 18 c8 20 01 20 ef 0f
----------------------------------------------
Opcode : Value
08 : 05
10 : 04
18 : c8 20 01 # The Value includes the next opcode (20)
20 : ef 0f
----------------------------------------------------------------------------
[Output]: Variable 3: b'c8'
[Output]: Variable 4: b'0120ef0f'
I was expecting an output of:
[Output]: Variable 3: b'c8' b'c82001'
[Output]: Variable 4: b'0120ef0f' b'ef0f'
It seems like there is an issue with my regular expression?
Update
To further clarify, var3 and var4 represent integers.
I have managed to figure out how the length of the Value is encoded. The most significant bit of each byte is used as a flag to signal that another byte follows. You can then strip the MSB of each byte, swap the endianness and convert to decimal.
data -> binary representation -> strip MSB and swap endianness -> decimal representation
ac d7 05 -> 10101100 11010111 00000101 -> 0001 01101011 10101100 -> 93100
e4 a6 04 -> 11100100 10100110 00000100 -> 0001 00010011 01100100 -> 70500
90 e1 02 -> 10010000 11100001 00000010 -> 10110000 10010000 -> 45200
dc 24 -> 11011100 00100100 -> 00010010 01011100 -> 4700
f0 60 -> 11110000 01100000 -> 00110000 01110000 -> 12400
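This is the base-128 "varint" scheme used by Protocol Buffers (which fits the pb. prefix in the function name above). As an illustration of the rule just described, here is a small decoder; it is written in C++ like the other snippets in this collection rather than Python, and the function name and interface are made up:
#include <cstddef>
#include <cstdint>
#include <cstdio>

// Decode a little-endian base-128 varint: the MSB of each byte is a
// continuation flag, the low 7 bits are payload, least significant first.
// *consumed receives the number of bytes eaten.
uint64_t decodeVarint(const uint8_t *p, size_t len, size_t *consumed)
{
    uint64_t value = 0;
    size_t i = 0;
    int shift = 0;
    while (i < len) {
        uint8_t b = p[i++];
        value |= (uint64_t)(b & 0x7F) << shift;  // strip MSB, accumulate
        shift += 7;
        if ((b & 0x80) == 0) break;              // MSB clear: last byte
    }
    *consumed = i;
    return value;
}

int main()
{
    const uint8_t sample[] = { 0xAC, 0xD7, 0x05 };  // from the table above
    size_t used;
    std::printf("%llu\n", (unsigned long long)decodeVarint(sample, sizeof sample, &used));  // prints 93100
}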
You may use:
def dataVariable(data, start, end):
    p = re.compile(b'(?=(' + start + b'.*' + end + b'))')
    res = []
    for x in p.findall(data):
        cur = b''
        for i, m in enumerate([x[i:i+1] for i in range(len(x))]):
            if i == 0:
                continue
            if m == end and cur:
                res.append(cur)
            cur = cur + m
    return res
See the Python demo:
data = bytearray.fromhex('08051004106418c8200120ef0f0f') # => b'c82001' b'c8'
#data = bytearray.fromhex('185618205720') # => b'56182057' b'2057' b'5618'
var3 = dataVariable(data, b'\x18', b'\x20')
print("Variable 3:", end=' ')
for item in set(var3):
    print(binascii.hexlify(item), end=' ')
Output is Variable 3: b'c8' b'c82001' for the '08051004106418c8200120ef0f0f' input and b'56182057' b'2057' b'5618' for the '185618205720' input.
The pattern is of (?=(...)) type to find all overlapping matches. If you do not need the overlapping feature, remove these parts from the regex.
The point here is:
match all substrings starting with start and running up to the last end, using the start + b'.*' + end pattern
iterate through each match, dropping the first start byte, and append the bytes accumulated so far to the result each time the end byte is found (thus collecting all the inner substrings inside the match).

How to convert ECDSA DER encoded signature data to microsoft CNG supported format?

I am preparing a minidriver to perform signing in a smart card using the NCryptSignHash function of Microsoft CNG.
When I sign with a SECP521R1 EC key in the smart card, it generates signature data 139 bytes long, in the following ECC signature format:
ECDSASignature ::= SEQUENCE {
    r INTEGER,
    s INTEGER
}
Sample signed data is
308188024201A2001E9C0151C55BCA188F201020A84180B339E61EDE61F6EAD0B277321CAB81C87DAFC2AC65D542D0D0B01C3C5E25E9209C47CFDDFD5BBCAFA0D2AF2E7FD86701024200C103E534BD1378D8B6F5652FB058F7D5045615DCD940462ED0F923073076EF581210D0DD95BF2891358F5F743DB2EC009A0608CEFAA9A40AF41718881D0A26A7F4
But when I sign using MS_KEY_STORAGE_PROVIDER, it generates a signature 132 bytes long.
What is the procedure to reduce the signature data size from 139 to 132?
Your input is in the X9.62 signature format, which is a SEQUENCE containing two ASN.1 / DER encoded INTEGERs. These integers are variable-sized, signed, big-endian numbers, encoded in the minimum number of bytes. This means that the size of the encoding can vary.
The 139 bytes is common because it assumes the maximum size of the encoding for r and s. These values are computed using modular arithmetic and they can therefore contain any number of bits, up to the number of bits of order n, which is the same as the key size, 521 bits.
The 132 bytes are specified by ISO/IEC 7816-8 / IEEE P1363, standards that deal with signatures for smart cards. The signature consists of the concatenation of r and s, where each is encoded in the minimum number of bytes needed to represent a value the size of the order, in bytes. Here r and s are statically sized, unsigned, big-endian numbers.
The calculation of the number of bytes of r or s is ceil((double) n / 8), i.e. (n + 8 - 1) / 8 in integer arithmetic, where 8 is the number of bits in a byte. So if the elliptic curve is 521 bits, the resulting size is 66 bytes each, and together they therefore consume 132 bytes.
Now on to the decoding. There are multiple ways of handling this; performing a full ASN.1 parse, obtaining the integers and then encoding them back in the ISO 7816-8 form is the most logical one.
However, you can also observe that you can simply copy bytes, as r and s will always be non-negative (and thus unsigned) and big endian; you just need to compensate for the size. The only hard part, then, is decoding the lengths of the components within the X9.62 structure.
Warning: the code below is C# instead of C++, as I expected C# to be the main .NET language; the language was not indicated in the question when I wrote the main part of the answer.
class ConvertECDSASignature
{
    private static int BYTE_SIZE_BITS = 8;
    private static byte ASN1_SEQUENCE = 0x30;
    private static byte ASN1_INTEGER = 0x02;

    public static byte[] lightweightConvertSignatureFromX9_62ToISO7816_8(int orderInBits, byte[] x9_62)
    {
        int offset = 0;
        if (x9_62[offset++] != ASN1_SEQUENCE)
        {
            throw new IllegalSignatureFormatException("Input is not a SEQUENCE");
        }

        int sequenceSize = parseLength(x9_62, offset, out offset);
        int sequenceValueOffset = offset;

        int nBytes = (orderInBits + BYTE_SIZE_BITS - 1) / BYTE_SIZE_BITS;
        byte[] iso7816_8 = new byte[2 * nBytes];

        // --- retrieve and copy r
        if (x9_62[offset++] != ASN1_INTEGER)
        {
            throw new IllegalSignatureFormatException("Input is not an INTEGER");
        }

        int rSize = parseLength(x9_62, offset, out offset);
        copyToStatic(x9_62, offset, rSize, iso7816_8, 0, nBytes);
        offset += rSize;

        // --- retrieve and copy s
        if (x9_62[offset++] != ASN1_INTEGER)
        {
            throw new IllegalSignatureFormatException("Input is not an INTEGER");
        }

        int sSize = parseLength(x9_62, offset, out offset);
        copyToStatic(x9_62, offset, sSize, iso7816_8, nBytes, nBytes);
        offset += sSize;

        if (offset != sequenceValueOffset + sequenceSize)
        {
            throw new IllegalSignatureFormatException("SEQUENCE is either too small or too large for the encoding of r and s");
        }

        return iso7816_8;
    }

    /**
     * Copies a variable sized, signed, big endian number to an array as a static sized, unsigned, big endian number.
     * Assumes that the iso7816_8 buffer is zeroized from the iso7816_8Offset for nBytes.
     */
    private static void copyToStatic(byte[] sint, int sintOffset, int sintSize, byte[] iso7816_8, int iso7816_8Offset, int nBytes)
    {
        // if the integer starts with zero, then skip it
        if (sint[sintOffset] == 0x00)
        {
            sintOffset++;
            sintSize--;
        }

        // after skipping the zero byte then the integer must fit
        if (sintSize > nBytes)
        {
            throw new IllegalSignatureFormatException("Number format of r or s too large");
        }

        // copy it into the right place
        Array.Copy(sint, sintOffset, iso7816_8, iso7816_8Offset + nBytes - sintSize, sintSize);
    }

    /*
     * Standalone BER decoding of a length value, up to 2^31 - 1.
     */
    private static int parseLength(byte[] input, int startOffset, out int offset)
    {
        offset = startOffset;
        byte l1 = input[offset++];

        // --- return value of single byte length encoding
        if (l1 < 0x80)
        {
            return l1;
        }

        // otherwise the first byte of the length specifies the number of encoding bytes that follows
        int end = offset + (l1 & 0x7F);

        uint result = 0;

        // --- skip leftmost zero bytes (for BER)
        while (offset < end)
        {
            if (input[offset] != 0x00)
            {
                break;
            }
            offset++;
        }

        // --- test against maximum value
        if (end - offset > sizeof(uint))
        {
            throw new IllegalSignatureFormatException("Length of TLV is too large");
        }

        // --- parse multi byte length encoding
        while (offset < end)
        {
            result = (result << BYTE_SIZE_BITS) ^ input[offset++];
        }

        // --- make sure that the uint isn't larger than an int can handle
        if (result > Int32.MaxValue)
        {
            throw new IllegalSignatureFormatException("Length of TLV is too large");
        }

        // --- return multi byte length encoding
        return (int) result;
    }
}
Note that the code is somewhat permissive in that it doesn't require the minimum length encoding for the SEQUENCE and INTEGER lengths (which it should).
It also allows wrongly encoded INTEGER values that are unnecessarily left-padded with zero bytes.
Neither of these issues should break the security of the algorithm but other libraries may and should be less permissive.
What is the procedure to reduce the sign data size from 139 to 132?
You have an ASN.1 encoded signature (shown below). It is used by Java, OpenSSL and some other libraries. You need the signature in P1363 format, which is a concatenation of r || s, without the ASN.1 encoding. P1363 is used by Crypto++ and a few other libraries. (There's another common signature format, and that is OpenPGP).
For the concatenation of r || s, both r and s must be 66 bytes, because the secp521r1 order of 521 bits rounds up to 66 bytes on an octet boundary. That means the procedure is: strip the outer SEQUENCE, then strip the two INTEGER headers, and concatenate the values of the two integers.
Your formatted r || s signature using your sample data will be:
01 A2 00 1E ... 7F D8 67 01 || 00 C1 03 E5 ... 0A 26 A7 F4
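Since the question is about C++ (the code above is C#), here is a compact sketch of that strip-and-concatenate procedure. It is an illustration rather than a hardened parser: the helper name derToP1363 is made up, and only the DER length forms that can occur at these sizes are handled. Called with the 139-byte sample above and orderBits = 521, it returns the 132-byte r || s.
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Convert a DER-encoded ECDSA signature, SEQUENCE { INTEGER r, INTEGER s },
// into the IEEE P1363 fixed-width concatenation r || s.
std::vector<uint8_t> derToP1363(const std::vector<uint8_t> &der, int orderBits)
{
    const size_t n = (orderBits + 7) / 8;            // 66 for secp521r1
    std::vector<uint8_t> out(2 * n, 0);
    size_t pos = 0;

    auto readLen = [&]() -> size_t {                 // short form plus 1- or 2-byte long form
        size_t len = der.at(pos++);
        if (len == 0x81)      { len = der.at(pos++); }
        else if (len == 0x82) { len = (size_t)der.at(pos) << 8 | der.at(pos + 1); pos += 2; }
        else if (len >= 0x80) { throw std::runtime_error("unsupported length form"); }
        return len;
    };

    if (der.at(pos++) != 0x30) throw std::runtime_error("not a SEQUENCE");
    readLen();                                       // SEQUENCE length, unused in this sketch

    for (int part = 0; part < 2; ++part) {           // r first, then s
        if (der.at(pos++) != 0x02) throw std::runtime_error("not an INTEGER");
        size_t len = readLen();
        while (len > n && der.at(pos) == 0x00) { ++pos; --len; }  // drop sign padding
        if (len > n) throw std::runtime_error("r or s too large");
        for (size_t i = 0; i < len; ++i)             // right-align into its fixed-size half
            out[part * n + (n - len) + i] = der.at(pos++);
    }
    return out;
}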
Microsoft .Net 2.0 has ASN.1 classes that allow you to manipulate ASN.1 encoded data. See AsnEncodedData class.
$ echo 308188024201A2001E9C0151C55BCA188F201020A84180B339E61EDE61F6EAD0B277321CAB
81C87DAFC2AC65D542D0D0B01C3C5E25E9209C47CFDDFD5BBCAFA0D2AF2E7FD86701024200C103E5
34BD1378D8B6F5652FB058F7D5045615DCD940462ED0F923073076EF581210D0DD95BF2891358F5F
743DB2EC009A0608CEFAA9A40AF41718881D0A26A7F4 | xxd -r -p > signature.bin
$ dumpasn1 signature.bin
  0 136: SEQUENCE {
  3  66:   INTEGER
        :     01 A2 00 1E 9C 01 51 C5 5B CA 18 8F 20 10 20 A8
        :     41 80 B3 39 E6 1E DE 61 F6 EA D0 B2 77 32 1C AB
        :     81 C8 7D AF C2 AC 65 D5 42 D0 D0 B0 1C 3C 5E 25
        :     E9 20 9C 47 CF DD FD 5B BC AF A0 D2 AF 2E 7F D8
        :     67 01
 71  66:   INTEGER
        :     00 C1 03 E5 34 BD 13 78 D8 B6 F5 65 2F B0 58 F7
        :     D5 04 56 15 DC D9 40 46 2E D0 F9 23 07 30 76 EF
        :     58 12 10 D0 DD 95 BF 28 91 35 8F 5F 74 3D B2 EC
        :     00 9A 06 08 CE FA A9 A4 0A F4 17 18 88 1D 0A 26
        :     A7 F4
        :   }
0 warnings, 0 errors.
Another noteworthy item is, .Net uses the XML format detailed in RFC 3275, XML-Signature Syntax and Processing. It is a different format than ASN.1, P1363, OpenPGP, CNG and other libraries.
The ASN.1 to P1363 conversion is rather trivial. You can see an example using the Crypto++ library at ECDSA sign with BouncyCastle and verify with Crypto++.
You might find Cryptographic Interoperability: Digital Signatures on Code Project helpful.

Binary File interpretation

I am reading in a binary file (in C++), and the header is something like this (printed in hexadecimal):
43 27 41 1A 00 00 00 00 23 00 00 00 00 00 00 00 04 63 68 72 31 FFFFFFB4 01 00 00 04 63 68 72 32 FFFFFFEE FFFFFFB7
when printed out using:
std::cout << hex << (int)mem[c];
Is there an efficient way to store 23, which is the 9th byte, into an integer without using stringstream? Or is stringstream the best way?
Something like
int n = mem[8];
I want to store 23 in n not 35.
You did store 23 in n. You only see 35 because you are outputting it with a routine that converts it to decimal for display. If you could look at the binary data inside the computer, you would see that it is in fact a hex 23.
You will get the same result as if you did:
int n=0x23;
(What you might think you want is impossible. What number should be stored in n for 1E? The only corresponding number is 30, which is what you would get.)
Do you mean you want to treat the value as binary-coded decimal? In that case, you could convert it using something like:
unsigned char bcd = mem[8];
unsigned char ones = bcd % 16;
unsigned char tens = bcd / 16;
if (ones > 9 || tens > 9) {
    // handle error
}
int n = 10 * tens + ones;
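Putting it together, a small self-contained sketch (the file name header.bin is made up). Reading into unsigned char also avoids the FFFFFFB4-style output visible in the dump above, which is what you get when a negative plain char is sign-extended to int before printing:
#include <fstream>
#include <iostream>

int main()
{
    std::ifstream in("header.bin", std::ios::binary);
    unsigned char mem[16];
    in.read(reinterpret_cast<char *>(mem), sizeof mem);

    int n = mem[8];                       // n now holds 0x23 (decimal 35)
    std::cout << std::hex << n << '\n';   // prints "23"
    std::cout << std::dec << n << '\n';   // prints "35"

    // BCD interpretation, as in the answer above: 0x23 -> decimal 23
    int bcd = 10 * (mem[8] >> 4) + (mem[8] & 0x0F);
    std::cout << bcd << '\n';             // prints "23"
}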