Question about C++ - c++

This is perhaps quite a simple one for some of you.
I was looking at the following serial read function and I can't quite comprehend what &prefix[2] does here. Does it mean only two bytes can be filled or something else ?
I should also mention this is part of the player/stage platform.
while (1)
{
cnt = 0;
while (cnt != 1)
{
if ((cnt += read(fd, &prefix[2], 1)) < 0)
{
perror("Error reading packet header from robot connection: P2OSPacket():Receive():read():");
return (1);
}
}
if (prefix[0] == 0xFA && prefix[1] == 0xFB)
{
break;
}
GlobalTime->GetTimeDouble(&timestamp);
prefix[0] = prefix[1];
prefix[1] = prefix[2];
}

This fragment implements a shift register of the size 3.
The oldest value is in prefix[0] and the latest in prefix[2]. That's why the address of prefix[2] is passed to the function read().
The loop is left, when the previous two bytes have been FA FB, the current (last received) byte is in prefix[2]. The function is left before that point, if nothing could be read (the return value of read differs from 1).
This shift register is used when you can't predict if other bytes are received in front of the sync characters FA FB. Reading three bytes with each read operation would not allow to synchronize somewhere in a data stream.

The call read(fd, &prefix[2], 1) just means "store a single byte in prefix[2]".
In general, &a[n] is the same address as (&a) + n

It is reading into an array called prefix, starting at an offset two places from the start of the array. In this case, only a single character is being read, but one could read more.

It looks like this piece of code is reading a serial communications stream waiting for the start of a header which is marked with FA, FB. The while loop reads characters singly into prefix[2] and shuffles them backwards through the array.
I think that the use of &prefix[2] is a trick which enables the next character in the header to appear in the prefix array when the while loop quits through the break.

Related

Writing a program for a computer that uses Litttle or Big endian. And have the same result [duplicate]

This question already has answers here:
Detecting endianness programmatically in a C++ program
(29 answers)
Closed 2 years ago.
This question is about endian's.
Goal is to write 2 bytes in a file for a game on a computer. I want to make sure that people with different computers have the same result whether they use Little- or Big-Endian.
Which of these snippet do I use?
char a[2] = { 0x5c, 0x7B };
fout.write(a, 2);
or
int a = 0x7B5C;
fout.write((char*)&a, 2);
Thanks a bunch.
From wikipedia:
In its most common usage, endianness indicates the ordering of bytes within a multi-byte number.
So for char a[2] = { 0x5c, 0x7B };, a[1] will be always 0x7B
However, for int a = 0x7B5C;, char* oneByte = (char*)&a; (char *)oneByte[0]; may be 0x7B or 0x5C, but as you can see, you have to play with casts and byte pointers (bear in mind the undefined behaviour when char[1], it is only for explanation purposes).
One way that is used quite often is to write a 'signature' or 'magic' number as the first data in the file - typically a 16-bit integer whose value, when read back, will depend on whether or not the reading platform has the same endianness as the writing platform. If you then detect a mismatch, all data (of more than one byte) read from the file will need to be byte swapped.
Here's some outline code:
void ByteSwap(void *buffer, size_t length)
{
unsigned char *p = static_cast<unsigned char *>(buffer);
for (size_t i = 0; i < length / 2; ++i) {
unsigned char tmp = *(p + i);
*(p + i) = *(p + length - i - 1);
*(p + length - i - 1) = tmp;
}
return;
}
bool WriteData(void *data, size_t size, size_t num, FILE *file)
{
uint16_t magic = 0xAB12; // Something that can be tested for byte-reversal
if (fwrite(&magic, sizeof(uint16_t), 1, file) != 1) return false;
if (fwrite(data, size, num, file) != num) return false;
return true;
}
bool ReadData(void *data, size_t size, size_t num, FILE *file)
{
uint16_t test_magic;
bool is_reversed;
if (fread(&test_magic, sizeof(uint16_t), 1, file) != 1) return false;
if (test_magic == 0xAB12) is_reversed = false;
else if (test_magic == 0x12AB) is_reversed = true;
else return false; // Error - needs handling!
if (fread(data, size, num, file) != num) return false;
if (is_reversed && (size > 1)) {
for (size_t i = 0; i < num; ++i) ByteSwap(static_cast<char *>(data) + (i*size), size);
}
return true;
}
Of course, in the real world, you wouldn't need to write/read the 'magic' number for every input/output operation - just once per file, and store the is_reversed flag for future use when reading data back.
Also, with proper use of C++ code, you would probably be using std::stream arguments, rather than the FILE* I have shown - but the sample I have posted has been extracted (with only very little modification) from code that I actually use in my projects (to do just this test). But conversion to better use of modern C++ should be straightforward.
Feel free to ask for further clarification and/or explanation.
NOTE: The ByteSwap function I have provided is not ideal! It almost certainly breaks strict aliasing rules and may well cause undefined behaviour on some platforms, if used carelessly. Also, it is not the most efficient method for small data units (like int variables). One could (and should) provide one's own byte-reversal function(s) to handle specific types of variables - a good case for overloading the function with different argument types).
Which of these snippet do I use?
The first one. It has same output regardless of native endianness.
But you'll find that if you need to interpret those bytes as some integer value, that is not so straightforward. char a[2] = { 0x5c, 0x7B } can represent either 0x5c7B (big endian) or 0x7B5c (little endian). So, which one did you intend?
The solution for cross platform interpretation of integers is to decide on particular byte order for the reading and writing. De-facto "standard" for cross platform data is to use big endian.
To write a number in big endian, start by bit-shifting the input value right so that the most significant byte is in the place of the least significant byte. Mask all other bytes (technically redundant in first iteration, but we'll loop back soon). Write this byte to the output. Repeat this for all other bytes in order of significance.
This algorithm produces same output regardless of the native endianness - it will even work on exotic "middle" endian systems if you ever encounter one. Writing to little endian is similar, but in reverse order.
To read a big endian value, read the first byte of input, shift it left so that it goes to the place of most significant byte. Combine the shifted byte with the result (initially zero) using bitwise-or. Repeat with the next byte by shifting to the second most significant place and so on.
to know the Endianess of a computer?
To know endianness of a system, you can use std::endian in the upcoming C++20. Prior to that, you can use implementation specific macros from endian.h header. Or you can do a simple calculation like you suggest.
But you never really need to know the endianness of a system. You can simply use the algorithms that I described, which work on systems of all endianness without having to know what that endianness is.

Arduino Void loop() does not loop

I am new to Arduino. I'm looking at Makecourse tutorial on RC522 RFID reader/writer
I pasted the script below. Basically, it detects a card then writes some data into block 2.
After compiling and uploading the code, I put up one card to the RFID reader, it works - but just once. When i put up a second card, nothing happens. Why?
I've compared this script with other example scripts from the mfrc522 library, they are pretty similar - in the void loop() section, it checks if NewCardPresent... ReadCardSerial... then proceeds to run the intended action. I've tried those example scripts and I can keep presenting a new card to the reader and the script will run again.
However this sketch, the loops only ran once. Is it the structure? How can i edit it to make it run continuously? I apologise if the answer is very simple :/
#include <SPI.h>//include the SPI bus library
#include <MFRC522.h>//include the RFID reader library
#define SS_PIN 10 //slave select pin
#define RST_PIN 5 //reset pin
MFRC522 mfrc522(SS_PIN, RST_PIN); // instatiate a MFRC522 reader object.
MFRC522::MIFARE_Key key;//create a MIFARE_Key struct named 'key', which will hold the card information
void setup() {
Serial.begin(9600); // Initialize serial communications with the PC
SPI.begin(); // Init SPI bus
mfrc522.PCD_Init(); // Init MFRC522 card (in case you wonder what PCD means: proximity coupling device)
Serial.println("Scan a MIFARE Classic card");
// Prepare the security key for the read and write functions - all six key bytes are set to 0xFF at chip delivery from the factory.
// Since the cards in the kit are new and the keys were never defined, they are 0xFF
// if we had a card that was programmed by someone else, we would need to know the key to be able to access it. This key would then need to be stored in 'key' instead.
for (byte i = 0; i < 6; i++) {
key.keyByte[i] = 0xFF;//keyByte is defined in the "MIFARE_Key" 'struct' definition in the .h file of the library
}
}
int block=2;//this is the block number we will write into and then read. Do not write into 'sector trailer' block, since this can make the block unusable.
byte blockcontent[16] = {"makecourse_____"};//an array with 16 bytes to be written into one of the 64 card blocks is defined
//byte blockcontent[16] = {0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0};//all zeros. This can be used to delete a block.
byte readbackblock[18];//This array is used for reading out a block. The MIFARE_Read method requires a buffer that is at least 18 bytes to hold the 16 bytes of a block.
void loop()
{
/*****************************************establishing contact with a tag/card**********************************************************************/
// Look for new cards (in case you wonder what PICC means: proximity integrated circuit card)
if ( ! mfrc522.PICC_IsNewCardPresent()) {//if PICC_IsNewCardPresent returns 1, a new card has been found and we continue
return;//if it did not find a new card is returns a '0' and we return to the start of the loop
}
// Select one of the cards
if ( ! mfrc522.PICC_ReadCardSerial()) {//if PICC_ReadCardSerial returns 1, the "uid" struct (see MFRC522.h lines 238-45)) contains the ID of the read card.
return;//if it returns a '0' something went wrong and we return to the start of the loop
}
// Among other things, the PICC_ReadCardSerial() method reads the UID and the SAK (Select acknowledge) into the mfrc522.uid struct, which is also instantiated
// during this process.
// The UID is needed during the authentication process
//The Uid struct:
//typedef struct {
//byte size; // Number of bytes in the UID. 4, 7 or 10.
//byte uidByte[10]; //the user ID in 10 bytes.
//byte sak; // The SAK (Select acknowledge) byte returned from the PICC after successful selection.
//} Uid;
Serial.println("card selected");
/*****************************************writing and reading a block on the card**********************************************************************/
writeBlock(block, blockcontent);//the blockcontent array is written into the card block
//mfrc522.PICC_DumpToSerial(&(mfrc522.uid));
//The 'PICC_DumpToSerial' method 'dumps' the entire MIFARE data block into the serial monitor. Very useful while programming a sketch with the RFID reader...
//Notes:
//(1) MIFARE cards conceal key A in all trailer blocks, and shows 0x00 instead of 0xFF. This is a secutiry feature. Key B appears to be public by default.
//(2) The card needs to be on the reader for the entire duration of the dump. If it is removed prematurely, the dump interrupts and an error message will appear.
//(3) The dump takes longer than the time alloted for interaction per pairing between reader and card, i.e. the readBlock function below will produce a timeout if
// the dump is used.
//mfrc522.PICC_DumpToSerial(&(mfrc522.uid));//uncomment this if you want to see the entire 1k memory with the block written into it.
readBlock(block, readbackblock);//read the block back
Serial.print("read block: ");
for (int j=0 ; j<16 ; j++)//print the block contents
{
Serial.write (readbackblock[j]);//Serial.write() transmits the ASCII numbers as human readable characters to serial monitor
}
Serial.println("");
}
and this is the function that came along with the sketch:
int writeBlock(int blockNumber, byte arrayAddress[])
{
//this makes sure that we only write into data blocks. Every 4th block is a trailer block for the access/security info.
int largestModulo4Number=blockNumber/4*4;
int trailerBlock=largestModulo4Number+3;//determine trailer block for the sector
if (blockNumber > 2 && (blockNumber+1)%4 == 0){Serial.print(blockNumber);Serial.println(" is a trailer block:");return 2;}//block number is a trailer block (modulo 4); quit and send error code 2
Serial.print(blockNumber);
Serial.println(" is a data block:");
/*****************************************authentication of the desired block for access***********************************************************/
byte status = mfrc522.PCD_Authenticate(MFRC522::PICC_CMD_MF_AUTH_KEY_A, trailerBlock, &key, &(mfrc522.uid));
//byte PCD_Authenticate(byte command, byte blockAddr, MIFARE_Key *key, Uid *uid);
//this method is used to authenticate a certain block for writing or reading
//command: See enumerations above -> PICC_CMD_MF_AUTH_KEY_A = 0x60 (=1100000), // this command performs authentication with Key A
//blockAddr is the number of the block from 0 to 15.
//MIFARE_Key *key is a pointer to the MIFARE_Key struct defined above, this struct needs to be defined for each block. New cards have all A/B= FF FF FF FF FF FF
//Uid *uid is a pointer to the UID struct that contains the user ID of the card.
if (status != MFRC522::STATUS_OK) {
Serial.print("PCD_Authenticate() failed: ");
Serial.println(mfrc522.GetStatusCodeName(status));
return 3;//return "3" as error message
}
//it appears the authentication needs to be made before every block read/write within a specific sector.
//If a different sector is being authenticated access to the previous one is lost.
/*****************************************writing the block***********************************************************/
status = mfrc522.MIFARE_Write(blockNumber, arrayAddress, 16);//valueBlockA is the block number, MIFARE_Write(block number (0-15), byte array containing 16 values, number of bytes in block (=16))
//status = mfrc522.MIFARE_Write(9, value1Block, 16);
if (status != MFRC522::STATUS_OK) {
Serial.print("MIFARE_Write() failed: ");
Serial.println(mfrc522.GetStatusCodeName(status));
return 4;//return "4" as error message
}
Serial.println("block was written");
}
int readBlock(int blockNumber, byte arrayAddress[])
{
int largestModulo4Number=blockNumber/4*4;
int trailerBlock=largestModulo4Number+3;//determine trailer block for the sector
/*****************************************authentication of the desired block for access***********************************************************/
byte status = mfrc522.PCD_Authenticate(MFRC522::PICC_CMD_MF_AUTH_KEY_A, trailerBlock, &key, &(mfrc522.uid));
//byte PCD_Authenticate(byte command, byte blockAddr, MIFARE_Key *key, Uid *uid);
//this method is used to authenticate a certain block for writing or reading
//command: See enumerations above -> PICC_CMD_MF_AUTH_KEY_A = 0x60 (=1100000), // this command performs authentication with Key A
//blockAddr is the number of the block from 0 to 15.
//MIFARE_Key *key is a pointer to the MIFARE_Key struct defined above, this struct needs to be defined for each block. New cards have all A/B= FF FF FF FF FF FF
//Uid *uid is a pointer to the UID struct that contains the user ID of the card.
if (status != MFRC522::STATUS_OK) {
Serial.print("PCD_Authenticate() failed (read): ");
Serial.println(mfrc522.GetStatusCodeName(status));
return 3;//return "3" as error message
}
//it appears the authentication needs to be made before every block read/write within a specific sector.
//If a different sector is being authenticated access to the previous one is lost.
/*****************************************reading a block***********************************************************/
byte buffersize = 18;//we need to define a variable with the read buffer size, since the MIFARE_Read method below needs a pointer to the variable that contains the size...
status = mfrc522.MIFARE_Read(blockNumber, arrayAddress, &buffersize);//&buffersize is a pointer to the buffersize variable; MIFARE_Read requires a pointer instead of just a number
if (status != MFRC522::STATUS_OK) {
Serial.print("MIFARE_read() failed: ");
Serial.println(mfrc522.GetStatusCodeName(status));
return 4;//return "4" as error message
}
Serial.println("block was read");
}
Arduino Loop() is supposed to be called infinitely unlike you observed.
Therefore you need to check two possibilities as below.
1) any return is happening at the beginning of the loop(). I saw two return statements. You'd better insert debug messages among them so that you could know how far reached before returning this function.
2) any blocking is happing in the loop(). I don't know but you'd better check this as well.
Turns out adding this few lines at the end of the script resolved the issue:
delay(1000);
mfrc522.PICC_HaltA();
mfrc522.PCD_StopCrypto1();
:)
I think the problem is not with the loop. It is ruining continuously. You can check it by using some serial print in the beginning of the loop. Problem is after a card is detected the first time this program doesn't identify a card the second time.So focus on that. The loop is fine.

zlib inflate stream and avail_in

Part of an application I'm working on involves receiving a compressed data stream in zlib (deflate) format, piece by piece over a socket. The routine is basically to receive the compressed data in chunks, and pass it to inflate as more data becomes available. When inflate returns Z_STREAM_END we know the full object has arrived.
A very simplified version of the basic C++ inflater function is as follows:
void inflater::inflate_next_chunk(void* chunk, std::size_t size)
{
m_strm.avail_in = size;
m_strm.next_in = chunk;
m_strm.next_out = m_buffer;
int ret = inflate(&m_strm, Z_NO_FLUSH);
/* ... check errors, etc. ... */
}
Except strangely, every like... 40 or so times, inflate will fail with a Z_DATA_ERROR.
According to the zlib manual, a Z_DATA_ERROR indicates a "corrupt or incomplete" stream. Obviously, there are any number of ways the data could be getting corrupted in my application that are way beyond the scope of this question - but after some tinkering around, I realized that the call to inflate would return Z_DATA_ERROR if m_strm.avail_in was not 0 before I set it to size. In other words, it seems that inflate is failing because there is already data in the stream before I set avail_in.
But my understanding is that every call to inflate should completely empty the input stream, meaning that when I call inflate again, I shouldn't have to worry if it didn't finish up with the last call. Is my understanding correct here? Or do I always need to check strm.avail_in to see if there is pending input?
Also, why would there ever be pending input? Why doesn't inflate simply consume all available input with with each call?
inflate() can return because it has filled the output buffer but not consumed all of the input data. If this happens you need to provide a new output buffer and call inflate() again until m_strm.avail.in == 0.
The zlib manual has this to say...
The detailed semantics are as follows. inflate performs one or both of
the following actions:
Decompress more input starting at next_in and update next_in and
avail_in accordingly. If not all input can be processed (because there
is not enough room in the output buffer), next_in is updated and
processing will resume at this point for the next call of inflate().
You appear to be assuming that your compressed input will always fit in your output buffer space, that's not always the case...
My wrapper code looks like this...
bool CDataInflator::Inflate(
const BYTE * const pDataIn,
DWORD &dataInSize,
BYTE *pDataOut,
DWORD &dataOutSize)
{
if (pDataIn)
{
if (m_stream.avail_in == 0)
{
m_stream.avail_in = dataInSize;
m_stream.next_in = const_cast<BYTE * const>(pDataIn);
}
else
{
throw CException(
_T("CDataInflator::Inflate()"),
_T("No space for input data"));
}
}
m_stream.avail_out = dataOutSize;
m_stream.next_out = pDataOut;
bool done = false;
do
{
int result = inflate(&m_stream, Z_BLOCK);
if (result < 0)
{
ThrowOnFailure(_T("CDataInflator::Inflate()"), result);
}
done = (m_stream.avail_in == 0 ||
(dataOutSize != m_stream.avail_out &&
m_stream.avail_out != 0));
}
while (!done && m_stream.avail_out == dataOutSize);
dataInSize = m_stream.avail_in;
dataOutSize = dataOutSize - m_stream.avail_out;
return done;
}
Note the loop and the fact that the caller relies on the dataInSize to know when all of the current input data has been consumed. If the output space is filled then the caller calls again using Inflate(0, 0, pNewBuffer, newBufferSize); to provide more buffer space...
Consider wrapping the inflate() call in a do-while loop until the stream's avail_out is not empty (i.e., some data have been extracted):
m_strm.avail_in = fread(compressed_data_buffer, 1, some_chunk_size / 8, some_file_pointer);
m_strm.next_in = compressed_data_buffer;
do {
m_strm.avail_out = some_chunk_size;
m_strm.next_out = inflated_data_buffer;
int ret = inflate(&m_strm, Z_NO_FLUSH);
/* error checking... */
} while (m_strm.avail_out == 0);
inflated_bytes = some_chunk_size - m_strm.avail_out;
Without debugging the internal workings of inflate(), I suspect it may on occasion simply need to run more than once before it can extract usable data.

fast reading constant data stream from serial port in C++.net

I'm trying to establish a SerialPort connection which transfers 16 bit data packages at a rate of 10-20 kHz. Im programming this in C++/CLI. The sender just enters an infinte while-loop after recieving the letter "s" and constantly sends 2 bytes with the data.
A Problem with the sending side is very unlikely, since a more simple approach works perfectly but too slow (in this approach, the reciever sends always an "a" first, and then gets 1 package consisting of 2 bytes. It leads to a speed of around 500Hz).
Here is the important part of this working but slow approach:
public: SerialPort^ port;
in main:
Parity p = (Parity)Enum::Parse(Parity::typeid, "None");
StopBits s = (StopBits)Enum::Parse(StopBits::typeid, "1");
port = gcnew SerialPort("COM16",384000,p,8,s);
port->Open();
and then doing as often as wanted:
port->Write("a");
int i = port->ReadByte();
int j = port->ReadByte();
This is now the actual approach im working with:
static int values[1000000];
static int counter = 0;
void reader(void)
{
SerialPort^ port;
Parity p = (Parity)Enum::Parse(Parity::typeid, "None");
StopBits s = (StopBits)Enum::Parse(StopBits::typeid, "1");
port = gcnew SerialPort("COM16",384000,p,8,s);
port->Open();
unsigned int i = 0;
unsigned int j = 0;
port->Write("s"); //with this command, the sender starts to send constantly
while(true)
{
i = port->ReadByte();
j = port->ReadByte();
values[counter] = j + (i*256);
counter++;
}
}
in main:
Thread^ readThread = gcnew Thread(gcnew ThreadStart(reader));
readThread->Start();
The counter increases (much more) rapidly at a rate of 18472 packages/s, but the values are somehow wrong.
Here is an example:
The value should look like this, with the last 4 bits changing randomly (its a signal of an analogue-digital converter):
111111001100111
Here are some values of the threaded solution given in the code:
1110011001100111
1110011000100111
1110011000100111
1110011000100111
So it looks like the connection reads the data in the middle of the package (to be exact: 3 bits too late). What can i do? I want to avoid a solution where this error is fixed later in the code while reading the packages like this, because I don't know if the the shifting error gets worse when I edit the reading code later, which I will do most likely.
Thanks in advance,
Nikolas
PS: If this helps, here is the code of the sender-side (an AtMega168), written in C.
uint8_t activate = 0;
void uart_puti16(uint16_t val) //function that writes the data to serial port
{
while ( !( UCSR0A & (1<<UDRE0)) ) //wait until serial port is ready
nop(); // wait 1 cycle
UDR0 = val >> 8; //write first byte to sending register
while ( !( UCSR0A & (1<<UDRE0)) ) //wait until serial port is ready
nop(); // wait 1 cycle
UDR0 = val & 0xFF; //write second byte to sending register
}
in main:
while(1)
{
if(active == 1)
{
uart_puti16(read()); //read is the function that gives a 16bit data set
}
}
ISR(USART_RX_vect) //interrupt-handler for a recieved byte
{
if(UDR0 == 'a') //if only 1 single data package is requested
{
uart_puti16(read());
}
if(UDR0 == 's') //for activating constant sending
{
active = 1;
}
if(UDR0 == 'e') //for deactivating constant sending
{
active = 0;
}
}
At the given bit rate of 384,000 you should get 38,400 bytes of data (8 bits of real data plus 2 framing bits) per second, or 19,200 two-byte values per second.
How fast is counter increasing in both instances? I would expect any modern computer to keep up with that rate whether using events or directly polling.
You do not show your simpler approach which is stated to work. I suggest you post that.
Also, set a breakpoint at the line
values[counter] = j + (i*256);
There, inspect i and j. Share the values you see for those variables on the very first iteration through the loop.
This is a guess based entirely on reading the code at http://msdn.microsoft.com/en-us/library/system.io.ports.serialport.datareceived.aspx#Y228. With this caveat out of the way, here's my guess:
Your event handler is being called when data is available to read -- but you are only consuming two bytes of the available data. Your event handler may only be called every 1024 bytes. Or something similar. You might need to consume all the available data in the event handler for your program to continue as expected.
Try to re-write your handler to include a loop that reads until there is no more data available to consume.

c++ stl: a good way to parse a sensor response

Attention please:
I already implemented this stuff, just not in any way generic or elegant. This question is motivated by my wanting to learn more tricks with the stl, not the problem itself.
This I think is clear in the way I stated that I already solved the problem, but many people have answered in their best intentions with solutions to the problem, not answers to the question "how to solve this the stl way". I am really sorry if I phrased this question in a confusing way. I hate to waste people's time.
Ok, here it comes:
I get a string full of encoded Data.
It comes in:
N >> 64 bytes
every 3 byte get decoded into an int value
after at most 64 byte (yes,not divisible by 3!) comes a byte as checksum
followed by a line feed.
and so it goes on.
It ends when 2 successive linefeeds are found.
It looks like a nice or at least ok data format, but parsing it elegantly
the stl way is a real bit**.
I have done the thing "manually".
But I would be interested if there is an elegant way with the stl- or maybe boost- magic that doesn't incorporate copying the thing.
Clarification:
It gets really big sometimes. The N >> 64byte was more like a N >>> 64 byte ;-)
UPDATE
Ok, the N>64 bytes seems to be confusing. It is not important.
The sensor takes M measurements as integers. Encodes each of them into 3 bytes. and sends them one after another
when the sensor has sent 64byte of data, it inserts a checksum over the 64 byte and an LF. It doesn't care if one of the encoded integers is "broken up" by that. It just continues in the next line.(That has only the effect to make the data nicely human readable but kindof nasty to parse elegantly.)
if it has finished sending data it inserts a checksum-byte and LFLF
So one data chunk can look like this, for N=129=43x3:
|<--64byte-data-->|1byte checksum|LF
|<--64byte-data-->|1byte checksum|LF
|<--1byte-data-->|1byte checksum|LF
LF
When I have M=22 measurements, this means I have N=66 bytes of data.
After 64 byte it inserts the checksum and LF and continues.
This way it breaks up my last measurement
which is encoded in byte 64, 65 and 66. It now looks like this: 64, checksum, LF, 65, 66.
Since a multiple of 3 divided by 64 carries a residue 2 out of 3 times, and everytime
another one, it is nasty to parse.
I had 2 solutions:
check checksum, concatenate data to one string that only has data bytes, decode.
run through with iterators and one nasty if construct to avoid copying.
I just thought there might be someting better. I mused about std::transform, but it wouldn't work because of the 3 byte is one int thing.
As much as I like STL, I don't think there's anything wrong with doing things manually, especially if the problem does not really fall into the cases the STL has been made for. Then again, I'm not sure why you ask. Maybe you need an STL input iterator that (checks and) discards the check sums and LF characters and emits the integers?
I assume the encoding is such that LF can only appear at those places, i.e., some kind of Base-64 or similar?
It seems to me that something as simple as the following should solve the problem:
string line;
while( getline( input, line ) && line != "" ) {
int val = atoi( line.substr(0, 3 ).c_str() );
string data = line.substr( 3, line.size() - 4 );
char csum = line[ line.size() - 1 ];
// process val, data and csum
}
In a real implementation you would want to add error checking, but the basic logic should remain the same.
As others have said, there is no silver bullet in stl/boost to elegantly solve your problem. If you want to parse your chunk directly via pointer arithmetic, perhaps you can take inspiration from std::iostream and hide the messy pointer arithmetic in a custom stream class. Here's a half-arsed solution I came up with:
#include <cctype>
#include <iostream>
#include <vector>
#include <boost/lexical_cast.hpp>
class Stream
{
public:
enum StateFlags
{
goodbit = 0,
eofbit = 1 << 0, // End of input packet
failbit = 1 << 1 // Corrupt packet
};
Stream() : state_(failbit), csum_(0), pos_(0), end_(0) {}
Stream(char* begin, char* end) {open(begin, end);}
void open(char* begin, char* end)
{state_=goodbit; csum_=0; pos_=begin, end_=end;}
StateFlags rdstate() const {return static_cast<StateFlags>(state_);}
bool good() const {return state_ == goodbit;}
bool fail() const {return (state_ & failbit) != 0;}
bool eof() const {return (state_ & eofbit) != 0;}
Stream& read(int& measurement)
{
measurement = readDigit() * 100;
measurement += readDigit() * 10;
measurement += readDigit();
return *this;
}
private:
int readDigit()
{
int digit = 0;
// Check if we are at end of packet
if (pos_ == end_) {state_ |= eofbit; return 0;}
/* We should be at least csum|lf|lf away from end, and we are
not expecting csum or lf here. */
if (pos_+3 >= end_ || pos_[0] == '\n' || pos_[1] == '\n')
{
state_ |= failbit;
return 0;
}
if (!getDigit(digit)) {return 0;}
csum_ = (csum_ + digit) % 10;
++pos_;
// If we are at checksum, check and consume it, along with linefeed
if (pos_[1] == '\n')
{
int checksum = 0;
if (!getDigit(checksum) || (checksum != csum_)) {state_ |= failbit;}
csum_ = 0;
pos_ += 2;
// If there is a second linefeed, we are at end of packet
if (*pos_ == '\n') {pos_ = end_;}
}
return digit;
}
bool getDigit(int& digit)
{
bool success = std::isdigit(*pos_);
if (success)
digit = boost::lexical_cast<int>(*pos_);
else
state_ |= failbit;
return success;
}
int csum_;
unsigned int state_;
char* pos_;
char* end_;
};
int main()
{
// Use (8-byte + csum + LF) fragments for this example
char data[] = "\
001002003\n\
300400502\n\
060070081\n\n";
std::vector<int> measurements;
Stream s(data, data + sizeof(data));
int meas = 0;
while (s.read(meas).good())
{
measurements.push_back(meas);
std::cout << meas << " ";
}
return 0;
}
Maybe you'll want to add extra StateFlags to determine if failure is due to checksum error or framing error. Hope this helps.
You should think of your communication protocol as being layered. Treat
|<--64byte-data-->|1byte checksum|LF
as fragments to be reassembled into larger packets of contiguous data. Once the larger packet is reconstituted, it is easier to parse its data contiguously (you don't have to deal with measurements being split up across fragments). Many existing network protocols (such as UDP/IP) does this sort of reassembly of fragments into packets.
It's possible to read the fragments directly into their proper "slot" in the packet buffer. Since your fragments have footers instead of headers, and there is no out-of-order arrival of your fragments, this should be fairly easy to code (compared to copyless IP reassembly algorithms). Once you receive an "empty" fragment (the duplicate LF), this marks the end of the packet.
Here is some sample code to illustrate the idea:
#include <vector>
#include <cassert>
class Reassembler
{
public:
// Constructs reassembler with given packet buffer capacity
Reassembler(int capacity) : buf_(capacity) {reset();}
// Returns bytes remaining in packet buffer
int remaining() const {return buf_.end() - pos_;}
// Returns a pointer to where the next fragment should be read
char* back() {return &*pos_;}
// Advances the packet's position cursor for the next fragment
void push(int size) {pos_ += size; if (size == 0) complete_ = true;}
// Returns true if an empty fragment was pushed to indicate end of packet
bool isComplete() const {return complete_;}
// Resets the reassembler so it can process a new packet
void reset() {pos_ = buf_.begin(); complete_ = false;}
// Returns a pointer to the accumulated packet data
char* data() {return &buf_[0];}
// Returns the size in bytes of the accumulated packet data
int size() const {return pos_ - buf_.begin();}
private:
std::vector<char> buf_;
std::vector<char>::iterator pos_;
bool complete_;
};
int readFragment(char* dest, int maxBytes, char delimiter)
{
// Read next fragment from source and save to dest pointer
// Return number of bytes in fragment, except delimiter character
}
bool verifyChecksum(char* fragPtr, int size)
{
// Returns true if fragment checksum is valid
}
void processPacket(char* data, int size)
{
// Extract measurements which are now stored contiguously in packet
}
int main()
{
const int kChecksumSize = 1;
Reassembler reasm(1000); // Use realistic capacity here
while (true)
{
while (!reasm.isComplete())
{
char* fragDest = reasm.back();
int fragSize = readFragment(fragDest, reasm.remaining(), '\n');
if (fragSize > 1)
assert(verifyChecksum(fragDest, fragSize));
reasm.push(fragSize - kChecksumSize);
}
processPacket(reasm.data(), reasm.size());
reasm.reset();
}
}
The trick will be making an efficient readFragment function that stops at every newline delimiter and stores the incoming data into the given destination buffer pointer. If you tell me how you acquire your sensor data, then I can perhaps give you more ideas.
An elegant solution this isn't. It would be more so by using a "transition matrix", and only reading one character at a time. Not my style. Yet this code has a minimum of redundant data movement, and it seems to do the job. Minimally C++, it really is just a C program. Adding iterators is left as an exercise for the reader. The data stream wasn't completely defined, and there was no defined destination for the converted data. Assumptions noted in comments. Lots of printing should show functionality.
// convert series of 3 ASCII decimal digits to binary
// there is a checksum byte at least once every 64 bytes - it can split a digit series
// if the interval is less than 64 bytes, it must be followd by LF (to identify it)
// if the interval is a full 64 bytes, the checksum may or may not be followed by LF
// checksum restricted to a simple sum modulo 10 to keep ASCII format
// checksum computations are only printed to allowed continuation of demo, and so results can be
// inserted back in data for testing
// there is no verification of the 3 byte sets of digits
// results are just printed, non-zero return indicates error
int readData(void) {
int binValue = 0, digitNdx = 0, sensorCnt = 0, lineCnt = 0;
char oneDigit;
string sensorTxt;
while( getline( cin, sensorTxt ) ) {
int i, restart = 0, checkSum = 0, size = sensorTxt.size()-1;
if(size < 0)
break;
lineCnt++;
if(sensorTxt[0] == '#')
continue;
printf("INPUT: %s\n", &sensorTxt[0]); // gag
while(restart<size) {
for(i=0; i<min(64, size); i++) {
oneDigit = sensorTxt[i+restart] & 0xF;
checkSum += oneDigit;
binValue = binValue*10 + oneDigit;
//printf("%3d-%X ", binValue, sensorTxt[i+restart]);
digitNdx++;
if(digitNdx == 3) {
sensorCnt++;
printf("READING# %d (LINE %d) = %d CKSUM %d\n",
sensorCnt, lineCnt, binValue, checkSum);
digitNdx = 0;
binValue = 0;
}
}
oneDigit = sensorTxt[i+restart] & 0x0F;
char compCheckDigit = (10-(checkSum%10)) % 10;
printf(" CKSUM at sensorCnt %d ", sensorCnt);
if((checkSum+oneDigit) % 10)
printf("ERR got %c exp %c\n", oneDigit|0x30, compCheckDigit|0x30);
else
printf("OK\n");
i++;
restart += i;
}
}
if(digitNdx)
return -2;
else
return 0;
}
The data definition was extended with comments, you you can use the following as is:
# normal 64 byte lines with 3 digit value split across lines
00100200300400500600700800901001101201301401501601701801902002105
22023024025026027028029030031032033034035036037038039040041042046
# short lines, partial values - remove checksum digit to combine short lines
30449
0451
0460479
0480490500510520530540550560570580590600610620630641
# long line with embedded checksums every 64 bytes
001002003004005006007008009010011012013014015016017018019020021052202302402502602702802903003103203303403503603703803904004104204630440450460470480490500510520530540550560570580590600610620630640
# dangling digit at end of file (with OK checksum)
37
Why are you concerned with copying? Is it the time overhead or the space overhead?
It sounds like you are reading all of the unparsed data into a big buffer, and now you want to write code that makes the big buffer of unparsed data look like a slightly smaller buffer of parsed data (the data minus the checksums and linefeeds), to avoid the space overhead involved in copying it into the slightly smaller buffer.
Adding a complicated layer of abstraction isn't going to help with the time overhead unless you only need a small portion of the data. And if that's the case, maybe you could just figure out which small portion you need, and copy that. (Then again, most of the abstraction layer may be already written for you, e.g. the Boost iterator library.)
An easier way to reduce the space overhead is to read the data in smaller chunks (or a line at a time), and parse it as you go. Then you only need to store the parsed version in a big buffer. (This assumes you're reading it from a file / socket / port, rather than being passed a large buffer that is not under your control.)
Another way to reduce the space overhead is to overwrite the data in place as you parse it. You will incur the cost of copying it, but you will only need one buffer. (This assumes that the 64-byte data doesn't grow when you parse it, i.e. it is not compressed.)