How is date encoded/stored in MySQL? - c++

I have to parse dates from the raw bytes I get from the database for my application in C++. I've found out that a date in MySQL is 4 bytes, and that the last two bytes are the month and the day respectively. But the first two bytes encode the year strangely: if the date is 2002-08-31, the content is 210, 15, 8, 31; if the date is 1996-12-22, it is stored as 204, 15, 12, 22.
Obviously, a single byte can't be bigger than 255, so I've checked year 2047 -- it's 255, 15 -- and year 2048 -- it's 128, 16.
At first I thought the key was in the binary representations, but I could not quite work out the logic:
2047: 0111 1111 1111
255: 0000 1111 1111
15: 0000 0000 1111
2048: 1000 0000 0000
128: 0000 1000 0000
16: 0000 0001 0000
Any idea?

It seems that the encoding logic is to drop the most significant bit of the first byte and then prepend the second byte's bits (as the higher-order bits) to the remaining seven, like this:
2002 from 210 and 15:
1101 0010 -> _101 0010;
0000 1111 + _101 0010 -> 0111 1101 0010
2048 from 128 and 16:
1000 0000 -> _000 0000
0001 0000 + _000 0000 -> 1000 0000 0000
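In other words, it is a little-endian base-128 (varint) encoding. A minimal sketch of the year decoding, assuming the two raw bytes are already at hand (the function name is made up for the example):
#include <cstdint>
#include <iostream>

// Decode a two-byte varint: the low 7 bits come from the first byte,
// the remaining bits from the second byte shifted left by 7.
int decode_two_byte_varint(uint8_t b0, uint8_t b1)
{
    return (b0 & 0x7F) | (b1 << 7);
}

int main()
{
    std::cout << decode_two_byte_varint(210, 15) << '\n'; // 2002
    std::cout << decode_two_byte_varint(204, 15) << '\n'; // 1996
    std::cout << decode_two_byte_varint(128, 16) << '\n'; // 2048
}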

We had the same issue and developed the following C++20 helper methods for production use with mysqlx (MySQL Connector/C++ 8.0 X DevAPI) to properly read DATE, DATETIME and TIMESTAMP fields:
#pragma once

#include <chrono>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

#include <mysqlx/xdevapi.h>

namespace mysqlx {

// Decode the raw protobuf varint stream of a Value into 64-bit integers.
static inline std::vector<uint64_t>
mysqlx_raw_as_u64_vector(const mysqlx::Value& in_value)
{
    std::vector<uint64_t> out;
    const auto bytes = in_value.getRawBytes();
    auto ptr = reinterpret_cast<const std::byte*>(bytes.first);
    auto end = reinterpret_cast<const std::byte*>(bytes.first) + bytes.second;
    while (ptr != end) {
        static constexpr std::byte carry_flag{0b1000'0000};
        static constexpr std::byte value_mask{0b0111'1111};
        uint64_t v = 0;
        uint64_t shift = 0;
        bool is_carry;
        do {
            auto byte = *ptr;
            is_carry = (byte & carry_flag) == carry_flag;
            v |= std::to_integer<uint64_t>(byte & value_mask) << shift;
            ++ptr;
            shift += 7;
        } while (is_carry && ptr != end && shift <= 63);
        out.push_back(v);
    }
    return out;
}

// Read a DATE value as year/month/day.
static inline std::chrono::year_month_day
read_date(const mysqlx::Value& value)
{
    const auto vector = mysqlx_raw_as_u64_vector(value);
    if (vector.size() < 3)
        throw std::out_of_range{"Value is not a valid DATE"};
    return std::chrono::year{static_cast<int>(vector.at(0))} /
           static_cast<int>(vector.at(1)) /
           static_cast<int>(vector.at(2));
}

// Read a DATETIME or TIMESTAMP value; missing trailing fields default to zero.
static inline std::chrono::system_clock::time_point
read_date_time(const mysqlx::Value& value)
{
    const auto vector = mysqlx_raw_as_u64_vector(value);
    if (vector.size() < 3)
        throw std::out_of_range{"Value is not a valid DATETIME"};
    auto ymd = std::chrono::year{static_cast<int>(vector.at(0))} /
               static_cast<int>(vector.at(1)) /
               static_cast<int>(vector.at(2));
    auto sys_days = std::chrono::sys_days{ymd};
    auto out = std::chrono::system_clock::time_point(sys_days);
    auto it = vector.begin() + 2;
    auto end = vector.end();
    if (++it == end)
        return out;
    out += std::chrono::hours{*it};
    if (++it == end)
        return out;
    out += std::chrono::minutes{*it};
    if (++it == end)
        return out;
    out += std::chrono::seconds{*it};
    if (++it == end)
        return out;
    out += std::chrono::microseconds{*it};
    return out;
}

} // namespace mysqlx
Which can then be used as follows:
auto row = table.select("datetime", "date").execute().fetchOne();
auto time_point = read_date_time(row[0]);
auto year_month_day = read_date(row[1]);
The getBytes documentation links to the ColumnMetaData documentation, which in turn links to the protobuf encoding documentation. The Protocol Buffers documentation says:
Base 128 Varints
Variable-width integers, or varints, are at the core of the wire
format. They allow encoding unsigned 64-bit integers using anywhere
between one and ten bytes, with small values using fewer bytes.
Each byte in the varint has a continuation bit that indicates if the
byte that follows it is part of the varint. This is the most
significant bit (MSB) of the byte (sometimes also called the sign
bit). The lower 7 bits are a payload; the resulting integer is built
by appending together the 7-bit payloads of its constituent bytes.
So, for example, here is the number 1, encoded as 01 – it’s a single
byte, so the MSB is not set:
0000 0001
^ msb
And here is 150, encoded as 9601 – this is a bit more complicated:
10010110 00000001
^ msb ^ msb
How do you figure out that this is 150? First you drop the MSB from
each byte, as this is just there to tell us whether we’ve reached the
end of the number (as you can see, it’s set in the first byte as there
is more than one byte in the varint). Then we concatenate the 7-bit
payloads, and interpret it as a little-endian, 64-bit unsigned
integer:
10010110 00000001 // Original inputs.
0010110 0000001 // Drop continuation bits.
0000001 0010110 // Put into little-endian order.
10010110 // Concatenate.
128 + 16 + 4 + 2 = 150 // Interpret as integer.
Because varints are so crucial to protocol buffers, in protoscope
syntax, we refer to them as plain integers. 150 is the same as 9601.

Based on what you provide, it seems to be N1 - 128 + N2 * 128.

Which version???
DATETIME used to be encoded in packed decimal (8 bytes). But, when fractional seconds were added, the format was changed to something like
Length indication (1 byte)
INT UNSIGNED for seconds-since-1970 (4 bytes)
fractional seconds (0-3 bytes)
DATE is stored like MEDIUMINT UNSIGNED (3 bytes) as days since 0000-00-00 (or something like that).
How did you get the "raw bytes"? There is no function to let you do that. SELECT HEX(some-date) first converts the date to a string (like "2022-03-22") and then takes the hex of that string, which gives you 323032322D30332D3232.


Related

How is type conversion done in the following code?

In the code below, the variable Speed is of type int. How is it stored in two variables of char type? I also don't understand the comment // 16 bits - 2 x 8 bits variables.
Can you explain the type conversion with an example? When I run the code, it shows symbols after the conversion.
void AX12A::turn(unsigned char ID, bool SIDE, int Speed)
{
    if (SIDE == LEFT)
    {
        char Speed_H, Speed_L;
        Speed_H = Speed >> 8;
        Speed_L = Speed; // 16 bits - 2 x 8 bits variables
    }
}

int main(){
    ax12a.turn(ID, Left, 200);
}
It seems like on your platform, a variable of type int is stored in 16 bits and a variable of type char is stored in 8 bits.
This is not always the case, as the C++ standard does not guarantee the size of these types; I made my assumption based on the code and the comment. Use data types of fixed size, such as the fixed-width types from <cstdint>, to make sure this assumption always holds.
Both int and char are integral types. When converting from a larger integral type to a smaller integral type (e.g. int to char), the most significant bits are discarded, and the least significant bits are kept (in this case, you keep the last 8 bits).
Before fully understanding the code, you also need to know about right shift. This simply moves the bits to the right (for the purpose of this answer, it does not matter what is inserted on the left). Therefore, the least significant bit (the rightmost bit) is discarded, and every other bit is moved one place to the right. This is very similar to division by 10 in the decimal system.
Now, you have your variable Speed, which has 16 bits.
Speed_H = Speed >> 8;
This shifts Speed 8 bits to the right, and then assigns the 8 least significant bits of the result to Speed_H. This basically means that Speed_H ends up holding the 8 most significant bits (the "upper" half) of Speed.
Speed_L = Speed;
Simply assigns to Speed_L the least significant 8 bits.
The comment basically states that you split a variable of 16 bits into 2 variables of 8 bits, with the first (most significant) 8 bits being stored in Speed_H and the last (least significant) 8 bits being stored in Speed_L.
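As an illustration, here is a small self-contained sketch of the split (and of putting the halves back together) using fixed-width types; the variable names are just for the example:
#include <cstdint>
#include <iostream>

int main()
{
    uint16_t speed = 200;                      // 0x00C8

    uint8_t speed_h = speed >> 8;              // most significant byte: 0x00
    uint8_t speed_l = speed & 0xFF;            // least significant byte: 0xC8

    // Recombine: high byte shifted back up, low byte OR'ed in.
    uint16_t recombined = (uint16_t(speed_h) << 8) | speed_l;

    std::cout << int(speed_h) << ' ' << int(speed_l) << ' '
              << recombined << '\n';           // prints: 0 200 200
}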
From your code I understand that sizeof(int) = 2 bytes in your case.
Let us take the example shown below.
int my_var = 200;
my_var is allocated 2 bytes of memory because its datatype is int.
The value assigned to my_var is 200.
Note that 200 decimal = 0x00C8 hexadecimal = 0000 0000 1100 1000 binary.
The higher byte 0000 0000 is stored in one of the addresses allocated to my_var,
and the lower byte 1100 1000 is stored in the other address, depending on endianness.
To know about endianness, check this link
https://www.geeksforgeeks.org/little-and-big-endian-mystery/
In your code :
int Speed = 200;
Speed_H = Speed >> 8;
=> 200 decimal value right shifted 8 times
=> that means 0000 0000 1100 1000 binary value right shifted by 8 bits
=> that means Speed_H = 0000 0000 binary
Speed_L = Speed;
=> Speed_L = 200;
=> Speed_L = 0000 0000 1100 1000 binary
=> Speed_L is of type char so it can accommodate only one byte
=> The value 0000 0000 1100 1000 will be narrowed (in other words "cut-off") to least significant byte and assigned to Speed_L.
=> Speed_L = 1100 1000 binary = 200 decimal

fastest way to convert int8 to int7

I have a function that takes an int8_t val and converts it to an int7_t.
//Bit [7] reserved
//Bits [6:0] = signed -64 to +63 offset value
// user who calls this function will use it correctly (-64 to +63)
uint8_t func_int7_t(int8_t val){
    uint8_t val_6 = val & 0b01111111;
    if (val & 0x80)
        val_6 |= 0x40;
    //...
    //do stuff...
    return val_6;
}
What is the best and fastest way to convert the int8 to int7? Did I do it efficiently, or is there a better way? The target is ARM Cortex M0+, if that matters.
UPDATE:
After reading the different answers I can say the question was asked wrongly (or my code in the question gave others the wrong assumptions). My intention was to turn an int8 into an int7.
So it can be done by doing nothing, because:
8bit:
63 = 0011 1111
62 = 0011 1110
0 = 0000 0000
-1 = 1111 1111
-2 = 1111 1110
-63 = 1100 0001
-64 = 1100 0000
7bit:
63 = 011 1111
62 = 011 1110
0 = 000 0000
-1 = 111 1111
-2 = 111 1110
-63 = 100 0001
-64 = 100 0000
The fastest way is probably:
uint8_t val_7 = (val & 0x3f) | ((val >> 1) & 0x40);
val & 0x3f keeps the 6 lower bits (truncation) and ((val >> 1) & 0x40) moves the sign bit from position 8 down to position 7.
The advantage of not using an if is shorter code (you could also use a conditional expression) and code without a branch.
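If you want to convince yourself, a throwaway check (not part of the original answer) can compare that expression against plain truncation over the whole valid range:
#include <cassert>
#include <cstdint>

int main()
{
    for (int v = -64; v <= 63; ++v) {
        int8_t val = static_cast<int8_t>(v);
        uint8_t val_7 = (val & 0x3f) | ((val >> 1) & 0x40); // expression from the answer above
        assert(val_7 == (val & 0x7f));                      // same result as plain truncation
    }
}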
To clear the reserved bit, just
return val & 0x7f;
To leave the reserved bit exactly like how it was from input, nothing needs to be done
return val;
and the low 7 bits will contain the values in [-64, 63], because in two's complement, down-casting is done by simple truncation and the value remains the same. That's what happens for an assignment like (int8_t)some_int_value.
There's no such thing as 0bX1100001. There's no undefined bit in machine language. That state only exists in hardware, like the high-Z state or undefined state in Verilog or other hardware description languages
Use a bit-field to narrow the value and let the compiler choose what sequence of shifts and/or masks is most efficient for that on your platform.
inline uint8_t to7bit(int8_t x)
{
    struct { uint8_t x : 7; } s;
    return s.x = x;
}
If you are not concerned about what happens to out-of-range values, then
return val & 0x7f;
is enough. This correctly handles values in the range -64 <= val <= 63.
You haven't said how you want to handle out-of-range values, so I have nothing to say about that.
Updated to add: The question has been updated to stipulate that the function will never be called with out-of-range values. So this method qualifies unambiguously as "best and fastest".
The user who calls this function knows they should pass data from -64 to +63.
So, not considering any other values, the really fastest thing you can do is nothing at all!
You have a 7-bit value stored in eight bits. Any value within the specified range will have bit 7 and bit 6 with the same value, and when you process the 7-bit value, you just ignore the MSB (of the 8-bit value), no matter whether it is set or not, e.g.:
for (unsigned int bit = 0x40; bit; bit >>= 1) // NOT: 0x80!
    std::cout << (value & bit);
The other way round is more critical: whenever you receive these seven bits via some communication channel, you need to do manual sign extension to eight (or more) bits to be able to use that value correctly.
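A minimal sign-extension sketch for that receiving side (assuming the 7-bit value arrives in the low bits of a byte; the helper name is made up):
#include <cstdint>

// Extend a 7-bit two's-complement value (in the low bits of 'raw')
// to a full 8-bit signed value.
int8_t sign_extend_7bit(uint8_t raw)
{
    raw &= 0x7F;             // keep only the 7 payload bits
    if (raw & 0x40)          // bit 6 is the sign bit of the 7-bit value
        raw |= 0x80;         // replicate it into bit 7
    return static_cast<int8_t>(raw); // e.g. 0x7F -> -1, 0x40 -> -64, 0x3F -> 63
}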

Unsure of normalising double values loaded as 2 bytes each

The code that I'm using for reading .wav file data into a 2D array:
int signal_frame_width = wavHeader.SamplesPerSec / 100; // 10 ms frame
int total_number_of_frames = numSamples / signal_frame_width;
double** loadedSignal = new double*[total_number_of_frames]; // array that contains the whole signal
int iteration = 0;
int16_t* buffer = new int16_t[signal_frame_width];
while ((bytesRead = fread(buffer, sizeof(buffer[0]), signal_frame_width, wavFile)) > 0)
{
    loadedSignal[iteration] = new double[signal_frame_width];
    for (int i = 0; i < signal_frame_width; i++) {
        // value normalisation:
        int16_t c = (buffer[i + 1] << 8) | buffer[i];
        double normalisedValue = c / 32768.0;
        loadedSignal[iteration][i] = normalisedValue;
    }
    iteration++;
}
The problem is this part; I don't exactly understand how it works:
int16_t c = (buffer[i + 1] << 8) | buffer[i];
It's an example taken from here.
I'm working with 16-bit .wav files only. As you can see, my buffer is loading (for example, at a sampling frequency of 44.1 kHz) 441 elements, each a 2-byte signed sample. How should I change the above code?
The original example, from which you constructed your code, used an array where each individual element represented a byte. It therefore needs to combine two consecutive bytes into a 16-bit value, which is what this line does:
int16_t c = (buffer[i + 1] << 8) | buffer[i];
It shifts the byte at index i+1 (here assumed to be the most significant byte) left by 8 positions, and then ORs the byte at index i onto that. For example, if buffer[i+1]==0x12 and buffer[i]==0x34, then you get
buffer[i+1] << 8 == 0x12 << 8 == 0x1200
0x1200 | buffer[i] == 0x1200 | 0x34 == 0x1234
(The | operator is a bitwise OR.)
Note that you need to be careful whether your WAV file is little-endian or big-endian (but the original post explains that quite well).
Now, if you store the resulting value in a signed 16-bit integer, you get a value between −32768 and +32767. The point in the actual normalization step (dividing by 32768) is just to bring the value range down to [−1.0, 1.0).
In your case above, you appear to already be reading into a buffer of 16-bit values. Note that your code will therefore only work if the endianness of your platform matches that of the WAV file you are working with. But if this assumption is correct, then you don't need the code line which you do not understand. You can just convert every array element into a double directly:
double normalisedValue = buffer[i]/32768.0;
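For example, a minimal standalone sketch of that simplified conversion (the helper name is made up; it assumes the file's byte order matches the platform's):
#include <cstdint>
#include <vector>

// Convert one frame of 16-bit PCM samples into normalised doubles in [-1.0, 1.0).
std::vector<double> normalise_frame(const int16_t* buffer, int signal_frame_width)
{
    std::vector<double> frame(signal_frame_width);
    for (int i = 0; i < signal_frame_width; i++)
        frame[i] = buffer[i] / 32768.0;  // each element is already a complete sample
    return frame;
}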
If buffer was an array of bytes, then that piece of code would interpret two consecutive bytes as a single 16-bit integer (assuming little-endian encoding). The | operator will perform a bit-wise OR on the bits of the two bytes. Since we wish to interpret the two bytes as a single 2-byte integer, then we must shift the bits of one of them 8 bits (1 byte) to the left. Which one depends on whether they are ordered in little-endian or big-endian order. Little-endian means that the least significant byte comes first, so we shift the second byte 8 bits to the left.
Example:
First byte: 0101 1100
Second byte: 1111 0100
Now shift second byte:
Second "byte": 1111 0100 0000 0000
First "byte": 0000 0000 0101 1100
Bitwise OR-operation (if either is 1, then 1. If both are 0, then 0):
16-bit integer: 1111 0100 0101 1100
In your case, however, the bytes in your file have already been interpreted as 16-bit ints using whatever endianness the platform has, so you do not need this step. However, in order to correctly interpret the bytes in the file, one must assume the same byte order they were written in. Therefore, one usually adds this step to ensure that the code works independently of the platform's endianness, relying instead on the expected byte order of the files (as most file formats specify what the byte order should be).

bitwise shifts, unsigned chars

Can anyone explain verbosely what this accomplishes? I'm trying to learn C and am having a hard time wrapping my head around it.
void tonet_short(uint8_t *p, unsigned short s) {
    p[0] = (s >> 8) & 0xff;
    p[1] = s & 0xff;
}

void tonet_long(uint8_t *p, unsigned long l)
{
    p[0] = (l >> 24) & 0xff;
    p[1] = (l >> 16) & 0xff;
    p[2] = (l >> 8) & 0xff;
    p[3] = l & 0xff;
}
Verbosely, here it goes:
As a direct answer: both of them store the bytes of a variable inside an array of bytes, from left to right. tonet_short does that for unsigned short variables, which consist of 2 bytes; and tonet_long does it for unsigned long variables, which consist of 4 bytes.
I will explain it for tonet_long, and tonet_short will just be the variation of it that you'll hopefully be able to derive yourself:
When the bits of an unsigned variable are bitwise-shifted, they move the given number of positions towards the given side, and the vacated bit positions are filled with zeros. I.e.:
unsigned char asd = 10; // which is 0000 1010 in base 2
asd <<= 2;              // shifts the bits of asd 2 places to the left
asd;                    // it is now 0010 1000, which is 40 in base 10
Keep in mind that this is for unsigned variables, and these may be incorrect for signed variables.
The bitwise-and & operator compares the bits of two operands on both sides, returns a 1 (true) if both are 1 (true), and 0 (false) if any or both of them are 0 (false); and it does this for each bit. Example:
unsigned char asd = 10; //0000 1010
unsigned char qwe = 6; //0000 0110
asd & qwe; //0000 0010 <-- this is what it evaluates to, which is 2
Now that we know the bitwise-shift and bitwise-and, let's get to the first line of the function tonet_long:
p[0] = (l >> 24) & 0xff;
Here, since l is unsigned long (assume 32 bits, i.e. 4 bytes), (l >> 24) evaluates to the first 4 * 8 - 24 = 8 bits of the variable l, which is the most significant byte of l. I can visualize the process like this:
abcd efgh ijkl mnop qrst uvwx yz.. .... //letters and dots stand for
//unknown zeros and ones
//shift this 24 times towards right
0000 0000 0000 0000 0000 0000 abcd efgh
Note that we do not change the l, this is just the evaluation of l >> 24, which is temporary.
Then 0xff, which in hexadecimal (base 16) represents the binary value 0000 0000 0000 0000 0000 0000 1111 1111, gets bitwise-ANDed with the bit-shifted l. It goes like this:
0000 0000 0000 0000 0000 0000 abcd efgh
&
0000 0000 0000 0000 0000 0000 1111 1111
=
0000 0000 0000 0000 0000 0000 abcd efgh
Since a & 1 depends only on a, the result is simply a; and the same goes for the rest... It looks like a redundant operation here, and it really is. It will, however, be important for the rest. This is because, for example, when you evaluate l >> 16, it looks like this:
0000 0000 0000 0000 abcd efgh ijkl mnop
Since we want only the ijkl mnop part, we have to discard the abcd efgh, and that will be done with the aid of 0000 0000 that 0xff has on its corresponding bits.
I hope this helps, the rest happens like it does this far, so... yeah.
These routines convert 16 and 32 bit values from native byte order to standard network(big-endian) byte order. They work by shifting and masking 8-bit chunks from the native value and storing them in order into a byte array.
If I see it right, it basically switches the order of the bytes in the short and in the long (reverses the byte order of the number, assuming a little-endian host) and stores the result at an address which hopefully has enough space :)
explain verbosely - OK...
void tonet_short(uint8_t *p, unsigned short s) {
short is typically a 16-bit value (max: 0xFFFF)
The uint8_t is an unsigned 8-bit value, and p is a pointer to some number of unsigned 8-bit values (from the code we're assuming at least 2 sequential ones).
p[0] = (s >> 8) & 0xff;
This takes the "top half" of the value in s and puts it in the first element in the array p. So let's assume s==0x1234.
First s is shifted by 8 bits (s >> 8 == 0x0012), then it's AND'ed with 0xFF and the result is stored in p[0]. (p[0] == 0x12)
p[1] = s & 0xff;
Now note that when we did that shift, we never changed the original value of s, so s still has the original value of 0x1234; thus when we do this second line we simply do another bit-wise AND and p[1] gets the "lower half" of the value of s (p[1] == 0x34).
The same applies for the other function you have there, but it's a long instead of a short, so we're assuming p in this case has enough space for all 32-bits (4x8) and we have to do some extra shifts too.
This code is used to serialize a 16-bit or 32-bit number into bytes (uint8_t). For example, to write them to disk, or to send them over a network connection.
A 16-bit value is split into two parts. One containing the most-significant (upper) 8 bits, the other containing least-significant (lower) 8 bits. The most-significant byte is stored first, then the least-significant byte. This is called big endian or "network" byte order. That's why the functions are named tonet_.
The same is done for the four bytes of a 32-bit value.
The & 0xff operations are actually useless. When a 16-bit or 32-bit value is converted to an 8-bit value, the lower 8 bits (0xff) are masked implicitly.
The bit-shifts are used to move the needed byte into the lowest 8 bits. Consider the bits of a 32-bit value:
AAAAAAAABBBBBBBBCCCCCCCCDDDDDDDD
The most significant byte is the 8 bits named A. In order to move it into the lowest 8 bits, the value has to be right-shifted by 24.
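For completeness, the inverse direction (network byte order back to a native value) is just shifts and ORs. A sketch under that assumption; the names are chosen by analogy to the originals and are not from the question:
#include <stdint.h>

// Rebuild a 16-bit value from two bytes in network (big-endian) order.
unsigned short fromnet_short(const uint8_t *p) {
    return (unsigned short)((p[0] << 8) | p[1]);
}

// Rebuild a 32-bit value from four bytes in network (big-endian) order.
unsigned long fromnet_long(const uint8_t *p) {
    return ((unsigned long)p[0] << 24) | ((unsigned long)p[1] << 16) |
           ((unsigned long)p[2] << 8)  |  (unsigned long)p[3];
}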
The names of the functions are a big hint... "to net short" and "to net long".
If you think about decimal... say we have two pieces of paper so small we can only write one digit on each of them; we can therefore use both to record all the numbers from 0 to 99: 00, 01, 02... 08, 09, 10, 11... 18, 19, 20... 98, 99. Basically, one piece of paper holds the "tens" column (given we're in base 10 for decimal), and the other the "units".
Memory works like that: each byte can store a number from 0..255, so we're working in base 256. If you have two bytes, one of them is going to be the "two-hundred-and-fifty-sixes" column, and the other the "units" column. To work out the combined value, you multiply the former by 256 and add the latter.
On paper we write numbers with the more significant ones on the left, but on a computer it's not clear if a more significant value should be in a higher or lower memory address, so different CPU manufacturers picked different conventions.
Consequently, some computers store 258 - which is 1 * 256 + 2 - as low=1 high=2, while others store low=2 high=1.
What these functions do is rearrange the memory from whatever your CPU happens to use to a predictable order - namely, the more significant value(s) go into the lower memory addresses, and eventually the "units" value is put into the highest memory address. This is a consistent way of storing the numbers that works across all computer types, so it's great when you want to transfer the data over the network; if the receiving computer uses a different memory ordering for the base-256 digits, it can move them from network byte ordering to whatever order it likes before interpreting them as CPU-native numbers.
So, "to net short" packs the most significant 8 bits of s into p[0] - the lower memory address. It didn't actually need to & 0xff as after taking the 16 input bits and shifting them 8 to the "right", all the left-hand 8 bits are guaranteed 0 anyway, which is the affect from & 0xFF - for example:
     1010 1111 1011 0111 // = 0xAFB7 = 10*16^3 + 15*16^2 + 11*16 + 7 = 44983 decimal
>>8  0000 0000 1010 1111 // move right 8, with left-hand values becoming 0
0xff 0000 0000 1111 1111 // we're going to AND the above with this
&    0000 0000 1010 1111 // the bits that were on in both of the above two values
                         // (the AND never changes the value here)

Saving a rs232 message to a variable

If I receive a message via RS232 that is 2 bytes long, e.g. 0000 0001 0001 1100 (that is 100011100, LSB on the right), I want to save it to a variable called value.
I am "decoding" the byte stream with this step:
rxByte = Serial1.read();
messageContent[0] = rxByte;
messageContent[1] = rxByte;
with the first rxByte having the value 0000 0001 and the second 0001 1100.
Or are those values already converted internally to HEX or DEC?
Now I have seen code that saves it this way to value:
uint32_t value = messageContent[0] * 256 + messageContent[1];
How does this work?
messageContent[0] * 256 is essentially a bit shift: the code is equivalent to (and more readable as)
uint32_t value = (messageContent[0] << 8) + messageContent[1];
So if messageContent[0] == 0x01 and messageContent[1] == 0x1C:
value = (0x01 << 8) + 0x1C
value = (0x0100) + 0x1C
value = 0x011C
That works fine, and depending on the endianness of your machine, it is also equivalent to:
uint32_t value = *((uint16_t*)(messageContent));
Decoding procedure:
//char messageContent[2]; // Always keep in mind the datatypes in use!!!
messageContent[0] = Serial1.read();
messageContent[1] = Serial1.read();
The way you were doing it places the same value in both positions.
If you want to read both bytes into a 16-bit or bigger integer:
short int messageContent = (Serial1.read() << 8) + Serial1.read();
(Note that the order in which the two read() calls are evaluated is unspecified, so it is safer to read into two temporaries first.)
Or are those values already converted internally to HEX or DEC?
Data is always binary. Hex or Dec is just its representation. You say "variable x has a value of 123" - this is a human interpretation; actually, variable x is a block of memory comprised of some bytes, which are themselves groups of 8 bits.
Now I have seen code that saves it this way to value:
uint32_t value = messageContent[0] * 256 + messageContent[1];
That's like me telling you "45 thousand and 123", so you build the number as 45*1000 + 123 = 45123. 256 is 2^8, one more than the largest value of a byte: b'1 0000 0000'.
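To tie it back to the code above, a minimal sketch of combining two received bytes (illustrative names; the byte order must match whatever the sender uses):
#include <cstdint>

// Combine two bytes received most-significant-first into one 16-bit value.
uint16_t combine_bytes(uint8_t high, uint8_t low)
{
    return static_cast<uint16_t>((high << 8) | low); // same as high * 256 + low
}

// Example: combine_bytes(0x01, 0x1C) == 0x011C == 284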