Comparing USART-received uint8_t* data with a constant string - c++

I'm working on an Arduino Due, trying to use DMA functions as I'm working on a project where speed is critical. I found the following function to receive through serial:
uint8_t DmaSerial::get(uint8_t* bytes, uint8_t length) {
// Disable receive PDC
uart->UART_PTCR = UART_PTCR_RXTDIS;
// Wait for PDC disable to take effect
while (uart->UART_PTSR & UART_PTSR_RXTEN);
// Modulus needed if RNCR is zero and RPR counts to end of buffer
rx_tail = (uart->UART_RPR - (uint32_t)rx_buffer) % DMA_SERIAL_RX_BUFFER_LENGTH;
// Make sure RPR follows (actually only needed if RPR is counted to the end of buffer and RNCR is zero)
uart->UART_RPR = (uint32_t)rx_buffer + rx_tail;
// Update fill counter
rx_count = DMA_SERIAL_RX_BUFFER_LENGTH - uart->UART_RCR - uart->UART_RNCR;
// No bytes in buffer to retrieve
if (rx_count == 0) { uart->UART_PTCR = UART_PTCR_RXTEN; return 0; }
uint8_t i = 0;
while (length--) {
bytes[i++] = rx_buffer[rx_head];
// If buffer is wrapped, increment RNCR, else just increment the RCR
if (rx_tail > rx_head) { uart->UART_RNCR++; } else { uart->UART_RCR++; }
// Increment head and account for wrap around
rx_head = (rx_head + 1) % DMA_SERIAL_RX_BUFFER_LENGTH;
// Decrement counter keeping track of amount data in buffer
rx_count--;
// Buffer is empty
if (rx_count == 0) { break; }
}
// Turn on receiver
uart->UART_PTCR = UART_PTCR_RXTEN;
return i;
}
So, as far as I understand, this function writes the received bytes to the buffer that bytes points to, up to a maximum of length bytes. So I'm calling it this way:
dma_serial1.get(data, 8);
without assigning its return value to a variable. I think the received data is stored in the uint8_t* data, but I might be wrong.
Finally, what I want to do is check whether the received data is a certain character in order to make decisions, like this:
if (data == "t"){
//do something//}
How could I make this work?

To compare strings as intended by if (data == "t"), you need a string comparison function such as strcmp. For this to work, you must ensure that the arguments are actually (0-terminated) C-strings:
uint8_t data[9];
uint8_t size = dma_serial1.get(data, 8);
data[size]='\0';
if (strcmp(data,"t")==0) {
...
}
Because data is a uint8_t* (that is, unsigned char*) while strcmp expects const char*, C++ will not convert the pointer implicitly, so the call above does not compile as written. A cast is needed to pass data to string functions:
if (strcmp(reinterpret_cast<const char*>(data),"t")==0) {
...
}
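So a complete MCVE could look as follows:
#include <cstdint>
#include <cstring>
#include <iostream>
using namespace std;
// Stand-in for dma_serial1.get(): pretends a single 't' was received
int get(uint8_t *data, int size) {
data[0] = 't';
return 1;
}
int main()
{
uint8_t data[9];
uint8_t size = get(data, 8);
data[size]='\0';
if (strcmp(reinterpret_cast<const char*>(data),"t")==0) {
cout << "found 't'" << endl;
}
}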
Output:
found 't'
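If only a fixed byte sequence has to be matched, the terminator can be avoided by comparing against the number of bytes actually received. The following is a minimal sketch, assuming that the return value of get() is the received byte count; the helper name matches is hypothetical and not part of DmaSerial.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of DmaSerial): true when the `received` bytes in
// `data` are exactly the characters of `expected`. No '\0' terminator is needed.
bool matches(const std::uint8_t* data, std::size_t received, const char* expected) {
    assert(data != nullptr && expected != nullptr);
    const std::size_t expectedLength = std::strlen(expected);
    return received == expectedLength &&
           std::memcmp(data, expected, expectedLength) == 0;
}
```

With this, the call site could read uint8_t size = dma_serial1.get(data, 8); followed by if (matches(data, size, "t")) { ... }, and data only needs room for 8 bytes.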

Related

Parsing Message with Varying Fields

I have a byte stream that represents a message in my application. There are 5 fields in the message for demonstration. The first byte in the stream indicates which message fields are present for the current stream. For instance 0x2 in the byte-0 means only the Field-1 is present for the current stream.
The mask field can take 2^5=32 different values. To parse messages of such varying width, I wrote the example structure and parser below. My question is: is there any other way to parse such dynamically changing fields? If the message had 64 fields, I would have to write 64 cases, which is cumbersome.
#include <cstdint>
#include <cstring>
#include <iostream>
typedef struct
{
uint8_t iDummy0;
int iDummy1;
}__attribute__((packed, aligned(1)))Field4;
typedef struct
{
int iField0;
uint8_t ui8Field1;
short i16Field2;
long long i64Field3;
Field4 stField4;
}__attribute__((packed, aligned(1)))MessageStream;
char* constructIncomingMessage()
{
char* cpStream = new char[1+sizeof(MessageStream)](); // Demonstrative message byte array (zero-initialized)
// 1 byte for Mask, 20 bytes for messageStream
cpStream[0] = 0x1F; // the 0-th byte is a mask marking
// which fields are present for the messageStream
// all 5 fields are present for the example
return cpStream;
}
void deleteMessage( char* cpMessage)
{
delete[] cpMessage;
}
int main() {
MessageStream messageStream; // Local storage for messageStream
uint8_t ui8FieldMask; // Mask to indicate which fields of messageStream
// are present for the current incoming message
const uint8_t ui8BitIsolator = 0x01;
uint8_t ui8FieldPresent; // ANDed result of Mask and Isolator
std::size_t szParsedByteCount = 0; // Total number of parsed bytes
const std::size_t szMaxMessageFieldCount = 5; // There can be maximum 5 fields in
// the messageStream
char* cpMessageStream = constructIncomingMessage();
ui8FieldMask = (uint8_t)cpMessageStream[0];
szParsedByteCount += 1;
for(std::size_t i = 0; i<szMaxMessageFieldCount; ++i)
{
ui8FieldPresent = ui8FieldMask & ui8BitIsolator;
if(ui8FieldPresent)
{
switch(i)
{
case 0:
{
memcpy(&messageStream.iField0, cpMessageStream+szParsedByteCount, sizeof(messageStream.iField0));
szParsedByteCount += sizeof(messageStream.iField0);
break;
}
case 1:
{
memcpy(&messageStream.ui8Field1, cpMessageStream+szParsedByteCount, sizeof(messageStream.ui8Field1));
szParsedByteCount += sizeof(messageStream.ui8Field1);
break;
}
case 2:
{
memcpy(&messageStream.i16Field2, cpMessageStream+szParsedByteCount, sizeof(messageStream.i16Field2));
szParsedByteCount += sizeof(messageStream.i16Field2);
break;
}
case 3:
{
memcpy(&messageStream.i64Field3, cpMessageStream+szParsedByteCount, sizeof(messageStream.i64Field3));
szParsedByteCount += sizeof(messageStream.i64Field3);
break;
}
case 4:
{
memcpy(&messageStream.stField4, cpMessageStream+szParsedByteCount, sizeof(messageStream.stField4));
szParsedByteCount += sizeof(messageStream.stField4);
break;
}
default:
{
std::cerr << "Undefined Message field number: " << i << '\n';
break;
}
}
}
ui8FieldMask >>= 1; // shift the mask
}
deleteMessage(cpMessageStream);
return 0;
}
The first thing I'd change is to drop the __attribute__((packed, aligned(1))) on Field4. This is a hack to create structures which mirror a packed wire-format, but that's not the format you're dealing with anyway.
Next, I'd make MessageStream a std::tuple of std::optional<T> fields.
You now know that there are std::tuple_size<MessageStream> possible bits in the mask. Obviously you can't fit 64 bits in a ui8FieldMask but I'll assume that's a trivial problem to solve.
You can write a for-loop from 0 to std::tuple_size<MessageStream> to extract the bits from ui8FieldMask to see which bits are set. The slight problem with that logic is that you'll need compile-time constants I for std::get<size_t I>(MessageStream), and a for-loop only gives you run-time variables.
Hence, you'll need a recursive template <size_t I> extract(char const*& cpMessageStream, MessageStream&), and of course a specialization extract<0>. In extract<I>, you can use typename std::tuple_element<I, MessageStream>::type to get the std::optional<T> at the I'th position in your MessageStream.
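A minimal sketch of that idea follows, under a few assumptions: the field types are illustrative (the nested Field4 is left out for brevity), the mask is widened to 64 bits, and the buffer is assumed to contain every field the mask announces. Instead of an explicit specialization that ends the recursion, this version uses if constexpr, which is available in C++17 just like std::optional, and it counts upwards from field 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <optional>
#include <tuple>

// Assumed field layout for illustration only.
using MessageStream = std::tuple<std::optional<std::int32_t>,
                                 std::optional<std::uint8_t>,
                                 std::optional<std::int16_t>,
                                 std::optional<std::int64_t>>;

// Reads field I if its mask bit is set, then recurses to field I + 1.
// The cursor advances only past fields that are present.
template <std::size_t I = 0>
void extract(std::uint64_t mask, const char*& cursor, MessageStream& message) {
    if constexpr (I < std::tuple_size<MessageStream>::value) {
        using Field =
            typename std::tuple_element<I, MessageStream>::type::value_type;
        assert(cursor != nullptr);
        if (mask & (std::uint64_t{1} << I)) {
            Field value;
            std::memcpy(&value, cursor, sizeof(Field));  // host byte order assumed
            cursor += sizeof(Field);
            std::get<I>(message) = value;
        } else {
            std::get<I>(message).reset();  // field absent in this message
        }
        extract<I + 1>(mask, cursor, message);
    }
}
```

Adding a field then means adding one entry to the tuple; no new case is needed. A production version would also pass the end of the buffer and stop when a field would run past it.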

circular buffer implementation

Yes, once again I come with that very straightforward implementation, which is something like this:
// write data always! if buffer is already full, overwrite old data!
void Put( const CONTENT_TYPE &data )
{
buffer[ inOffset++] = data;
inOffset%=size;
// was data overwritten, skip it by increment read offset
if ( inOffset == outOffset )
{
outOffset++;
outOffset%=size;
std::cout << "Overwrite" << std::endl;
}
}
CONTENT_TYPE Pull()
{
CONTENT_TYPE data = buffer[ outOffset++ ];
outOffset %= size;
return data;
}
But this simple algorithm uses only size-1 elements of the buffer!
If I want to avoid that, the only solution I found is adding another counter variable, which wastes sizeof(counter_var) - sizeof(element) bytes.
Q: Is there a solution that does not waste memory? It looks so terribly simple, but I can't figure it out :-)
Remark: There are some more lines of code to protect against empty reads and other things, but they are not important to the question. It is not tagged c++ because the algorithm does not depend on the language, even though I give a C++ code example.
You can use two integers and fill all slots if one is an index and the other an element count, then convert to find the second index on the fly:
void put(const ELEMENT& element) {
if (nElements == size) throw "put: buffer full";
buffer[(start + nElements++) % size] = element;
}
ELEMENT get() {
if (nElements == 0) throw "get: buffer empty";
ELEMENT& value = buffer[start];
start = (start + 1) % size;
--nElements;
return value;
}
Of course you can replace the mod operations with if (foo >= size) foo -= size; if you like.
You'd just deal with that by using different points in time at which you do the modulo operation; assume we increase the read and write pointers after every access. If we now do the read pointer's modulo instantly after increasing, and the write pointer's modulo just right before reading, the |write-read| of a full buffer would be the length of the buffer, without any special case handling. For that to work, your write pointer should always be used % buffer_length, but stored % (2 * buffer_length).
I don't especially like Mark's answer, because handling things as special cases is usually not a good idea, and neither is introducing negative sentinel values in a place where you'd typically use size_t (i.e. an unsigned integer).
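The delayed-modulo idea can be sketched as follows. Note that this is a commonly used variant, not a literal transcription of the description above: here both indices are stored modulo 2 * capacity and reduced modulo capacity only when they address the storage, which makes "full" and "empty" distinguishable without a separate counter. The class and member names are hypothetical, and put refuses to overwrite rather than dropping old data.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Ring buffer that uses every slot: both indices are stored modulo 2 * capacity
// and reduced modulo capacity only when addressing the storage.
template <typename T>
class RingBuffer {
public:
    explicit RingBuffer(std::size_t capacity) : slots_(capacity) {
        assert(capacity > 0);
    }

    // Number of stored elements, in the range 0..capacity.
    std::size_t size() const {
        return (write_ + 2 * slots_.size() - read_) % (2 * slots_.size());
    }
    bool empty() const { return read_ == write_; }
    bool full() const { return size() == slots_.size(); }

    // Returns false instead of overwriting when the buffer is full.
    bool put(const T& value) {
        if (full()) return false;
        slots_[write_ % slots_.size()] = value;
        write_ = (write_ + 1) % (2 * slots_.size());
        return true;
    }

    // Returns false when there is nothing to read.
    bool get(T& out) {
        if (empty()) return false;
        out = slots_[read_ % slots_.size()];
        read_ = (read_ + 1) % (2 * slots_.size());
        return true;
    }

private:
    std::vector<T> slots_;
    std::size_t read_ = 0;
    std::size_t write_ = 0;
};
```

When the buffer is full the two indices differ by exactly the capacity, and when it is empty they are equal, so no slot is sacrificed.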
You could use a special sentinel value for one of the offsets, such as -1, to indicate that the buffer is full or empty. This will complicate your code for checking and modifying the offset.
// write data always! if buffer is already full, overwrite old data!
void Put( const CONTENT_TYPE &data )
{
buffer[ inOffset++] = data;
inOffset%=size;
// was data overwritten, skip it by setting read offset to sentinel
if ( inOffset == outOffset || outOffset == -1 )
{
outOffset = -1;
std::cout << "Overwrite" << std::endl;
}
}
CONTENT_TYPE Pull()
{
if (outOffset == -1)
outOffset = inOffset;
CONTENT_TYPE data = buffer[ outOffset++ ];
outOffset %= size;
return data;
}
bool IsEmpty()
{
return outOffset == inOffset;
}

Is there a better way to handle incomplete data in a buffer and reading?

I am processing a binary file that is built up of events. Each event can have a variable length. Since my read buffer is a fixed size I handle things as follows:
const int bufferSize = 0x500000;
const int readSize = 0x400000;
const int eventLengthMask = 0x7FFE0000;
const int eventLengthShift = 17;
const int headerLengthMask = 0x1F000;
const int headerLengthShift = 12;
const int slotMask = 0xF0;
const int slotShift = 4;
const int channelMask = 0xF;
...
//allocate the buffer we allocate 5 MB even though we read in 4MB chunks
//to deal with unprocessed data from the end of a read
char* allocBuff = new char[bufferSize]; //inFile reads data into here
unsigned int* buff = reinterpret_cast<unsigned int*>(allocBuff); //data is interpreted from here
inFile.open(fileName.c_str(),ios_base::in | ios_base::binary);
int startPos = 0;
while(!inFile.eof())
{
int index = 0;
inFile.read(&(allocBuff[startPos]), readSize);
int size = ((readSize + startPos)>>2);
//loop to process the buffer
while (index<size)
{
unsigned int data = buff[index];
int eventLength = ((data&eventLengthMask)>>eventLengthShift);
int headerLength = ((data&headerLengthMask)>>headerLengthShift);
int slot = ((data&slotMask)>>slotShift);
int channel = data&channelMask;
//now check if the full event is in the buffer
if( (index+eventLength) > size )
{//the full event is not in the buffer
break;
}
++index;
//further processing of the event
}
//move the data at the end of the buffer to the beginning and set start position
//for the next read
for(int i = index; i<size; ++i)
{
buff[i-index] = buff[i];
}
startPos = ((size-index)<<2);
}
My question is this: Is there a better way to handle having unprocessed data at the end of the buffer?
You could improve it by using a circular buffer rather than a simple array. That, or a circular iterator over the array. Then you don't need to do all that copying — the "start" of the array moves.
Other than that, no, not really.
When I encountered this problem in the past, I simply copied the
unprocessed data down, and then read from the end of it. This
is a valid solution (and by far the simplest) if the individual
elements are fairly small and the buffer is large. (On a modern
machine, "fairly small" can easily be anything up to a couple of
hundred KB.) Of course, you'll have to keep track of how much
you've copied down, to adjust the pointer and the size of the
next read.
Beyond that:
You'd be better off using std::vector<char> for the buffer.
You can't convert four bytes read from a disk into an
unsigned int just by casting its address; you have to insert
each of the bytes into the unsigned int where it belongs.
And finally: you don't check that the read has succeeded
before processing the data. Using unbuffered input with an
istream is a bit tricky: your loop should probably be
something like
while ( inFile.read( addr, len ) || inFile.gcount() != 0 )....
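The points above can be combined into one sketch: a std::vector<char> buffer, unprocessed bytes copied down before the next read, words assembled byte by byte, and gcount() deciding whether a read delivered anything. Several things here are assumptions: the file is taken to be little-endian, the event length (bits 17..30 of the header, as in the question) is taken to count 32-bit words including the header word, and the function names are hypothetical. The function only counts events; real processing would go where the counter is incremented.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Assembles a 32-bit word from four bytes, assuming little-endian file order.
std::uint32_t readWordLE(const char* p) {
    const unsigned char* b = reinterpret_cast<const unsigned char*>(p);
    return static_cast<std::uint32_t>(b[0]) |
           (static_cast<std::uint32_t>(b[1]) << 8) |
           (static_cast<std::uint32_t>(b[2]) << 16) |
           (static_cast<std::uint32_t>(b[3]) << 24);
}

// Counts the complete events in `in`, reading `readSize` bytes at a time.
std::size_t countEvents(std::istream& in, std::size_t readSize) {
    assert(readSize > 0);
    std::vector<char> buffer;
    std::size_t leftover = 0;  // unprocessed bytes carried over from the last pass
    std::size_t events = 0;
    for (;;) {
        buffer.resize(leftover + readSize);
        in.read(buffer.data() + leftover, static_cast<std::streamsize>(readSize));
        const std::size_t got = static_cast<std::size_t>(in.gcount());
        if (got == 0) break;  // nothing more to read
        const std::size_t available = leftover + got;
        std::size_t pos = 0;
        while (available - pos >= 4) {
            const std::uint32_t header = readWordLE(buffer.data() + pos);
            const std::size_t eventBytes = ((header & 0x7FFE0000u) >> 17) * 4;
            if (eventBytes == 0) return events;       // malformed header: stop
            if (eventBytes > available - pos) break;  // incomplete: need more data
            ++events;  // further processing of the event would go here
            pos += eventBytes;
        }
        // Copy the unprocessed tail down to the start for the next read.
        leftover = available - pos;
        std::memmove(buffer.data(), buffer.data() + pos, leftover);
    }
    return events;
}

// Convenience overload for in-memory data, e.g. in tests.
std::size_t countEvents(const std::string& bytes, std::size_t readSize) {
    std::istringstream in(bytes);
    return countEvents(in, readSize);
}
```

Because the vector grows by the leftover amount, an event longer than one read chunk is still handled; a trailing, truncated event is simply not counted.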

C++ defensive programming: reading from a buffer with type safety

Let's say I have a class that I don't own: DataBuffer. It provides various get member functions:
get(uint8_t *value);
get(uint16_t *value);
...
When reading from a structure contained in this buffer, I know the order and size of fields, and I want to reduce the chance of future code changes causing an error:
struct Record
{
uint16_t Header;
uint16_t Content;
};
void ReadIntoRecord(Record* r)
{
DataBuffer buf( initialized from the network with bytes )
buf.get(&r->Header); // Good!
buf.get(&r->Content);
}
Then someone checks in a change to do something with the header before writing it:
uint8_t customHeader;
buf.get(&customHeader); // Wrong, stopped reading after only 1 byte
r->Header = customHeader + 1;
buf.get(&r->Content); // now we're reading from the wrong part of the buffer.
Is the following an acceptable way to harden the code against changes? Remember, I can't change the function names to getByte, getUShort, etc. I could inherit from DataBuffer, but that seems like overkill.
buf.get(static_cast<uint16_t*>(&r->Header)); // compiler will catch incorrect variable type
buf.get(static_cast<uint16_t*>(&r->Content));
Updated with not-eye-safe legacy code example:
float dummy_float;
uint32_t dummy32;
uint16_t dummy16;
uint8_t dummy8;
uint16_t headTypeTemp;
buf.get(static_cast<uint16_t*>(&headTypeTemp));
m_headType = HeadType(headTypeTemp);
buf.get(static_cast<uint8_t*>(&hid));
buf.get(m_Name);
buf.get(m_SerialNumber);
float start;
buf.get(static_cast<float*>(&start));
float stop;
buf.get(static_cast<float*>(&stop));
buf.get(static_cast<float*>(&dummy_float));
setStuffA(dummy_float);
buf.get(static_cast<uint16_t*>(&dummy16));
setStuffB(float(dummy16)/1000);
buf.get(static_cast<uint8_t*>(&dummy8)); //reserved
buf.get(static_cast<uint32_t*>(&dummy32));
Entries().setStart( dummy32 );
buf.get(static_cast<uint32_t*>(&dummy32));
Entries().setStop( dummy32 );
buf.get(static_cast<float*>(&dummy_float));
Entries().setMoreStuff( dummy_float );
uint32_t datalength;
buf.get(static_cast<uint32_t*>(&datalength));
Entries().data().setLength(datalength);
RetVal ret = ReturnCode::SUCCESS;
Entry* data_ptr = Entries().data().data();
for (unsigned int i = 0; i < datalength && ret == ReturnCode::SUCCESS; i++)
{
ret = buf.get(static_cast<float*>(&dummy_float));
data_ptr[i].FieldA = dummy_float;
}
for (unsigned int i = 0; i < datalength && ret == ReturnCode::SUCCESS; i++)
{
ret = buf.get(static_cast<float*>(&dummy_float));
data_ptr[i].FieldB = dummy_float;
}
// Read in the normalization vector
Util::SimpleVector<float> norm;
buf.get(static_cast<uint32_t*>(&datalength));
norm.setLength(datalength);
for (unsigned int i=0; i<datalength; i++)
{
norm[i] = buf.getFloat();
}
setNormalization(norm);
return ReturnCode::SUCCESS;
}
Don't use overloading. Why not have get_word and get_dword calls? The interface isn't going to be any uglier but at least the mistake is a lot harder to make.
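Since DataBuffer itself cannot be changed, explicitly named calls could be added as free functions around it. The sketch below assumes that "word" means 16 bits; the DataBuffer class shown is only a stand-in for the real one, and get_byte and get_word are hypothetical names. The deleted templates make a call with any other pointer type a compile-time error.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-in for the third-party class; the real DataBuffer is assumed to offer
// overloaded get(T*) members like these.
class DataBuffer {
public:
    DataBuffer(const unsigned char* bytes, std::size_t size)
        : bytes_(bytes), size_(size) {}
    void get(std::uint8_t* value) { read(value, sizeof *value); }
    void get(std::uint16_t* value) { read(value, sizeof *value); }

private:
    void read(void* out, std::size_t count) {
        assert(pos_ + count <= size_);
        std::memcpy(out, bytes_ + pos_, count);  // host byte order assumed
        pos_ += count;
    }
    const unsigned char* bytes_;
    std::size_t size_;
    std::size_t pos_ = 0;
};

// Explicit-width wrappers: the name states how many bytes are consumed, and the
// deleted template rejects every pointer type other than the exact one.
inline void get_byte(DataBuffer& buf, std::uint8_t* value) { buf.get(value); }
template <typename T>
void get_byte(DataBuffer&, T*) = delete;

inline void get_word(DataBuffer& buf, std::uint16_t* value) { buf.get(value); }
template <typename T>
void get_word(DataBuffer&, T*) = delete;
```

With these, get_word(buf, &r->Header) compiles, while get_word(buf, &customHeader) with a uint8_t customHeader is rejected by the compiler.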
Wouldn't it be better to read the whole struct from the network? Letting the user do all the socket operations seems like a bad idea to me (not encapsulated). Encapsulate the data you want to send over the network in functions that operate on file descriptors, instead of letting the user write raw buffer data to the file descriptors.
I can imagine something like
void readHeader(int filedes, struct Record * Header);
so you can do something like this
struct Record
{
uint16_t Header;
uint16_t Content;
uint16_t getHeader() const { return Header; }
uint16_t getContent() const { return Content; }
};
/* socket stuff to get filedes */
struct Record x;
readHeader(fd, &x);
x.getContent();
You can't read from buffer with type safety unless the buffer contains information about the content. One simple method is to add length to each structure and check that at least the data being read is still the sane length. You could also use XML or ASN.1 or something similar where type information is provided. Of course I'm assuming that you also write to that buffer.
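The length check could look roughly like this. The framing is an assumption made for illustration: a little-endian 16-bit payload length followed by the two 16-bit fields of Record. The function names readU16LE and readRecord are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Record {
    std::uint16_t Header;
    std::uint16_t Content;
};

// Reads a little-endian 16-bit value at `offset`; the caller guarantees bounds.
std::uint16_t readU16LE(const std::vector<std::uint8_t>& wire, std::size_t offset) {
    assert(offset + 2 <= wire.size());
    return static_cast<std::uint16_t>(wire[offset] | (wire[offset + 1] << 8));
}

// Assumed framing: [u16 payload length][u16 Header][u16 Content].
// Returns false when the declared length is not what a Record needs or when
// the buffer is shorter than the frame claims.
bool readRecord(const std::vector<std::uint8_t>& wire, Record& out) {
    constexpr std::size_t kPrefixSize = 2;
    constexpr std::size_t kPayloadSize = 4;
    if (wire.size() < kPrefixSize) return false;
    if (readU16LE(wire, 0) != kPayloadSize) return false;
    if (wire.size() < kPrefixSize + kPayloadSize) return false;
    out.Header = readU16LE(wire, 2);
    out.Content = readU16LE(wire, 4);
    return true;
}
```

If a later change makes the writer emit a different field size, the declared length no longer matches and the reader fails loudly instead of silently reading from the wrong offset.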

C code - need to clarify the effectiveness

Hi, I have written some code based on a requirement.
(field1_6)(field2_30)(field3_16)(field4_16)(field5_1)(field6_6)(field7_2)(field8_1).....
This is one bucket (8 fields) of data. We will receive 20 buckets at a time, which means 160 fields in total.
I need to take the values of field3, field7 and field8 based on a predefined condition.
If the input argument is N, then take the three fields from the 1st bucket; if it is Y, I need
to take the three fields from a bucket other than the 1st one.
If the argument is Y, then I need to scan all 20 buckets one after another and check whether
the first field of the bucket is not equal to 0; if so, fetch the three fields of that bucket and exit.
I have written the code and it is working fine, but I am not confident that it is effective.
I am afraid it may crash at some point. Please advise; the code is below.
int CMI9_auxc_parse_balance_info(char *i_balance_info,char *i_use_balance_ind,char *o_balance,char *o_balance_change,char *o_balance_sign
)
{
char *pch = NULL;
char *balance_id[MAX_BUCKETS] = {NULL};
char balance_info[BALANCE_INFO_FIELD_MAX_LENTH] = {0};
char *str[160] = {NULL};
int i=0,j=0,b_id=0,b_ind=0,bc_ind=0,bs_ind=0,rc;
int total_bukets ;
memset(balance_info,' ',BALANCE_INFO_FIELD_MAX_LENTH);
memcpy(balance_info,i_balance_info,BALANCE_INFO_FIELD_MAX_LENTH);
//balance_info[BALANCE_INFO_FIELD_MAX_LENTH]='\0';
pch = strtok (balance_info,"*");
while (pch != NULL && i < 160)
{
str[i]=(char*)malloc(strlen(pch) + 1);
strcpy(str[i],pch);
pch = strtok (NULL, "*");
i++;
}
total_bukets = i/8 ;
for (j=0;str[b_id]!=NULL,j<total_bukets;j++)
{
balance_id[j]=str[b_id];
b_id=b_id+8;
}
if (!memcmp(i_use_balance_ind,"Y",1))
{
if (atoi(balance_id[0])==1)
{
memcpy(o_balance,str[2],16);
memcpy(o_balance_change,str[3],16);
memcpy(o_balance_sign,str[7],1);
for(i=0;i<160;i++)
free(str[i]);
return 1;
}
else
{
for(i=0;i<160;i++)
free(str[i]);
return 0;
}
}
else if (!memcmp(i_use_balance_ind,"N",1))
{
for (j=1;balance_id[j]!=NULL,j<MAX_BUCKETS;j++)
{
b_ind=(j*8)+2;
bc_ind=(j*8)+3;
bs_ind=(j*8)+7;
if (atoi(balance_id[j])!=1 && atoi( str[bc_ind] )!=0)
{
memcpy(o_balance,str[b_ind],16);
memcpy(o_balance_change,str[bc_ind],16);
memcpy(o_balance_sign,str[bs_ind],1);
for(i=0;i<160;i++)
free(str[i]);
return 1;
}
}
for(i=0;i<160;i++)
free(str[i]);
return 0;
}
for(i=0;i<160;i++)
free(str[i]);
return 0;
}
My feeling is that this code is very brittle. It may well work when given good input (I don't propose to desk check the thing for you) but if given some incorrect inputs it will either crash and burn or give misleading results.
Have you tested for unexpected inputs? For example:
Suppose i_balance_info is null?
Suppose i_balance_info is ""?
Suppose there are fewer than 8 items in the input string, what will this line of code do?
memcpy(o_balance_sign,str[7],1);
Suppose that the item in str[3] is less than 16 chars long, what will this line of code do?
memcpy(o_balance_change,str[3],16);
My approach to writing such code would be to protect against all such eventualities. At the very least I would add ASSERT() statements; usually I would write explicit input validation and return errors when the input is bad. The problem here is that the interface does not seem to allow for any possibility that there might be bad input.
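As an illustration of such validation, here is a minimal C++ sketch that rejects null, empty, and too-short input before touching any field. The function names are hypothetical, and unlike strtok this splitter keeps empty fields, so field positions stay stable even when a value is missing; that is a deliberate difference from the original code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

constexpr std::size_t kFieldsPerBucket = 8;

// Splits `input` on `delimiter`; empty fields are kept (unlike strtok).
std::vector<std::string> splitFields(const std::string& input, char delimiter) {
    std::vector<std::string> fields;
    std::size_t start = 0;
    for (;;) {
        const std::size_t end = input.find(delimiter, start);
        if (end == std::string::npos) {
            fields.push_back(input.substr(start));
            break;
        }
        fields.push_back(input.substr(start, end - start));
        start = end + 1;
    }
    return fields;
}

// Copies fields 3, 4 and 8 of the first bucket, mirroring str[2], str[3] and
// str[7] in the question. Returns false and leaves the outputs untouched when
// the input is null, empty, or has fewer than eight fields.
bool readFirstBucket(const char* balanceInfo, std::string& balance,
                     std::string& balanceChange, std::string& balanceSign) {
    if (balanceInfo == nullptr || *balanceInfo == '\0') return false;
    const std::vector<std::string> fields = splitFields(balanceInfo, '*');
    if (fields.size() < kFieldsPerBucket) return false;
    balance = fields[2];
    balanceChange = fields[3];
    balanceSign = fields[7];
    return true;
}
```

Using std::string for the outputs also removes the fixed 16-byte memcpy, so a short field cannot cause a read past its end.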
I had a hard time reading your code but FWIW I've added some comments, HTH:
// do shorter functions, long functions are harder to follow and make errors harder to spot
// document all your variables, at the very least your function parameters
// also what the function is supposed to do and what it expects as input
int CMI9_auxc_parse_balance_info
(
char *i_balance_info,
char *i_use_balance_ind,
char *o_balance,
char *o_balance_change,
char *o_balance_sign
)
{
char *balance_id[MAX_BUCKETS] = {NULL};
char balance_info[BALANCE_INFO_FIELD_MAX_LENTH] = {0};
char *str[160] = {NULL};
int i=0,j=0,b_id=0,b_ind=0,bc_ind=0,bs_ind=0,rc;
int total_bukets=0; // good practice to initialize all variables
//
// check for null pointers in your arguments, and do sanity checks for any
// calculations
// also move variable declarations to just before they are needed
//
memset(balance_info,' ',BALANCE_INFO_FIELD_MAX_LENTH);
memcpy(balance_info,i_balance_info,BALANCE_INFO_FIELD_MAX_LENTH);
//balance_info[BALANCE_INFO_FIELD_MAX_LENTH]='\0'; // should be BALANCE_INFO_FIELD_MAX_LENTH-1
char *pch = strtok (balance_info,"*"); // this will potentially crash since no ending \0
while (pch != NULL && i < 160)
{
str[i]=(char*)malloc(strlen(pch) + 1);
strcpy(str[i],pch);
pch = strtok (NULL, "*");
i++;
}
total_bukets = i/8 ;
// you have declared char*str[160] check if enough b_id < 160
// asserts are helpful if nothing else assert( b_id < 160 );
for (j=0;str[b_id]!=NULL,j<total_bukets;j++)
{
balance_id[j]=str[b_id];
b_id=b_id+8;
}
// don't use memcmp, if ('Y'==i_use_balance_ind[0]) is better
if (!memcmp(i_use_balance_ind,"Y",1))
{
// atoi needs balance_id str to end with \0 has it?
if (atoi(balance_id[0])==1)
{
// length assumptions and memcpy when its only one byte
memcpy(o_balance,str[2],16);
memcpy(o_balance_change,str[3],16);
memcpy(o_balance_sign,str[7],1);
for(i=0;i<160;i++)
free(str[i]);
return 1;
}
else
{
for(i=0;i<160;i++)
free(str[i]);
return 0;
}
}
// if ('N'==i_use_balance_ind[0])
else if (!memcmp(i_use_balance_ind,"N",1))
{
// here I get a headache, this looks just at first glance risky.
for (j=1;balance_id[j]!=NULL,j<MAX_BUCKETS;j++)
{
b_ind=(j*8)+2;
bc_ind=(j*8)+3;
bs_ind=(j*8)+7;
if (atoi(balance_id[j])!=1 && atoi( str[bc_ind] )!=0)
{
// length assumptions and memcpy when its only one byte
// here u assume strlen(str[b_ind])>15 including \0
memcpy(o_balance,str[b_ind],16);
// here u assume strlen(str[bc_ind])>15 including \0
memcpy(o_balance_change,str[bc_ind],16);
// here, besides length assumption you could use a simple assignment
// since its one byte
memcpy(o_balance_sign,str[bs_ind],1);
// a common practice is to set pointers that are freed to NULL.
// maybe not necessary here since u return
for(i=0;i<160;i++)
free(str[i]);
return 1;
}
}
// suggestion do one function that frees your pointers to avoid dupl
for(i=0;i<160;i++)
free(str[i]);
return 0;
}
for(i=0;i<160;i++)
free(str[i]);
return 0;
}
A helpful technique when you want to access offsets in an array is to create a struct that maps the memory layout. Then you cast your pointer to a pointer to the struct and use the struct members to extract information instead of your various memcpy calls.
I would also suggest you reconsider the parameters of the function in general. If you place all of them in a struct, you have better control and the function becomes more readable, e.g.
int foo( input* inbalance, output* outbalance )
(or whatever it is you are trying to do)
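A sketch of both suggestions in C++ follows, under a strong assumption: the data is fixed-width with the widths listed in the question (6, 30, 16, 16, 1, 6, 2, 1) and has no delimiters. The question's code splits on '*', so this applies only if that assumption actually holds. The struct and function names are hypothetical, and the bucket is copied with memcpy rather than accessed through a cast pointer, which sidesteps alignment and aliasing concerns.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Assumed fixed-width layout of one bucket (78 bytes, no delimiters).
struct Bucket {
    char id[6];
    char field2[30];
    char balance[16];
    char balanceChange[16];
    char field5[1];
    char field6[6];
    char field7[2];
    char balanceSign[1];
};
static_assert(sizeof(Bucket) == 78, "Bucket must match the 78-byte record");

// Outputs grouped in one struct; the arrays leave room for a terminating '\0'.
struct BalanceResult {
    char balance[17];
    char balanceChange[17];
    char balanceSign;
};

// Copies bucket number `index` out of `data`, which holds `length` bytes.
// Returns false when the requested bucket is not fully contained in the input.
bool readBucket(const char* data, std::size_t length, std::size_t index,
                BalanceResult& out) {
    if (data == nullptr || (index + 1) * sizeof(Bucket) > length) return false;
    Bucket bucket;
    std::memcpy(&bucket, data + index * sizeof(Bucket), sizeof(Bucket));
    std::memcpy(out.balance, bucket.balance, sizeof bucket.balance);
    out.balance[sizeof bucket.balance] = '\0';
    std::memcpy(out.balanceChange, bucket.balanceChange,
                sizeof bucket.balanceChange);
    out.balanceChange[sizeof bucket.balanceChange] = '\0';
    out.balanceSign = bucket.balanceSign[0];
    return true;
}
```

The field offsets now live in one place, the struct definition, and the bounds check happens once per bucket instead of being implied by each memcpy.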