zlib inflate stream and avail_in - c++

Part of an application I'm working on involves receiving a compressed data stream in zlib (deflate) format, piece by piece over a socket. The routine is basically to receive the compressed data in chunks, and pass it to inflate as more data becomes available. When inflate returns Z_STREAM_END we know the full object has arrived.
A very simplified version of the basic C++ inflater function is as follows:
void inflater::inflate_next_chunk(void* chunk, std::size_t size)
{
m_strm.avail_in = size;
m_strm.next_in = static_cast<Bytef*>(chunk);
m_strm.next_out = m_buffer;
int ret = inflate(&m_strm, Z_NO_FLUSH);
/* ... check errors, etc. ... */
}
Except that, strangely, about every 40th call or so, inflate will fail with a Z_DATA_ERROR.
According to the zlib manual, a Z_DATA_ERROR indicates a "corrupt or incomplete" stream. Obviously, there are any number of ways the data could be getting corrupted in my application that are way beyond the scope of this question - but after some tinkering around, I realized that the call to inflate would return Z_DATA_ERROR if m_strm.avail_in was not 0 before I set it to size. In other words, it seems that inflate is failing because there is already data in the stream before I set avail_in.
But my understanding is that every call to inflate should completely empty the input stream, meaning that when I call inflate again, I shouldn't have to worry if it didn't finish up with the last call. Is my understanding correct here? Or do I always need to check strm.avail_in to see if there is pending input?
Also, why would there ever be pending input? Why doesn't inflate simply consume all available input with each call?

inflate() can return because it has filled the output buffer but not consumed all of the input data. If this happens you need to provide a new output buffer and call inflate() again until m_strm.avail_in == 0.
The zlib manual has this to say...
The detailed semantics are as follows. inflate performs one or both of the following actions:
Decompress more input starting at next_in and update next_in and avail_in accordingly. If not all input can be processed (because there is not enough room in the output buffer), next_in is updated and processing will resume at this point for the next call of inflate().
You appear to be assuming that your compressed input will always fit in your output buffer space; that's not always the case...
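Applied to the question's function, the fix is a loop that gives inflate() fresh output space on every pass. A minimal sketch, assuming m_buffer is a fixed-size array and that the decompressed bytes are consumed inside the loop:
void inflater::inflate_next_chunk(void* chunk, std::size_t size)
{
    m_strm.avail_in = static_cast<uInt>(size);
    m_strm.next_in = static_cast<Bytef*>(chunk);
    do {
        m_strm.avail_out = sizeof(m_buffer); // assumption: m_buffer is a fixed array
        m_strm.next_out = m_buffer;
        int ret = inflate(&m_strm, Z_NO_FLUSH);
        /* ... check ret for Z_STREAM_END / errors ... */
        std::size_t produced = sizeof(m_buffer) - m_strm.avail_out;
        /* ... hand off 'produced' bytes from m_buffer here ... */
    } while (m_strm.avail_out == 0); // output was filled: more may be pending
}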
My wrapper code looks like this...
bool CDataInflator::Inflate(
const BYTE * const pDataIn,
DWORD &dataInSize,
BYTE *pDataOut,
DWORD &dataOutSize)
{
if (pDataIn)
{
if (m_stream.avail_in == 0)
{
m_stream.avail_in = dataInSize;
m_stream.next_in = const_cast<BYTE * const>(pDataIn);
}
else
{
throw CException(
_T("CDataInflator::Inflate()"),
_T("No space for input data"));
}
}
m_stream.avail_out = dataOutSize;
m_stream.next_out = pDataOut;
bool done = false;
do
{
int result = inflate(&m_stream, Z_BLOCK);
if (result < 0)
{
ThrowOnFailure(_T("CDataInflator::Inflate()"), result);
}
done = (m_stream.avail_in == 0 ||
(dataOutSize != m_stream.avail_out &&
m_stream.avail_out != 0));
}
while (!done && m_stream.avail_out == dataOutSize);
dataInSize = m_stream.avail_in;
dataOutSize = dataOutSize - m_stream.avail_out;
return done;
}
Note the loop and the fact that the caller relies on dataInSize to know when all of the current input data has been consumed. If the output space is filled, the caller calls Inflate() again with pDataIn set to 0 and a fresh output buffer to provide more space...
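A hedged sketch of that calling pattern, assuming a CDataInflator instance named inflator; pChunk, chunkSize and ProcessDecompressed are illustrative names, not part of the original code:
BYTE out[8192];                        // illustrative output buffer
DWORD inSize = chunkSize;              // size of the newly received chunk
DWORD outSize = sizeof(out);
inflator.Inflate(pChunk, inSize, out, outSize);
ProcessDecompressed(out, outSize);     // hypothetical consumer of the output
while (inSize != 0)                    // input still pending: output space was filled
{
    outSize = sizeof(out);
    inflator.Inflate(0, inSize, out, outSize); // no new input, just fresh output space
    ProcessDecompressed(out, outSize);
}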

Consider wrapping the inflate() call in a do-while loop that repeats until the stream's avail_out is nonzero after the call (i.e., inflate ran out of input before completely filling the output buffer):
m_strm.avail_in = fread(compressed_data_buffer, 1, some_chunk_size / 8, some_file_pointer);
m_strm.next_in = compressed_data_buffer;
do {
m_strm.avail_out = some_chunk_size;
m_strm.next_out = inflated_data_buffer;
int ret = inflate(&m_strm, Z_NO_FLUSH);
/* error checking... */
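/* consume (some_chunk_size - m_strm.avail_out) bytes from inflated_data_buffer here, since the next pass overwrites the buffer */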
} while (m_strm.avail_out == 0);
inflated_bytes = some_chunk_size - m_strm.avail_out;
Without debugging the internal workings of inflate(), I suspect it may on occasion simply need to run more than once before it can extract usable data.

Related

How to use AVPacket as a local (temporary) variable

My program receives an AAC audio stream from the network and uses ffmpeg to decode the stream, so I must pack the stream data into an AVPacket struct. I use a local variable to do this; the code is below:
bool OnRecvAACStream(const char * audioDataPtr,int audioDataSize,int64_t tBeg,int64_t tDura)
{
AVPacket pkt_tmp; // local variable
av_init_packet(&pkt_tmp);
pkt_tmp.data = (uint8_t *)audioDataPtr;
pkt_tmp.size = audioDataSize;
pkt_tmp.pts = tBeg;
pkt_tmp.duration = tDura;
if (avcodec_send_packet(m_codec_ctx, &pkt_tmp) < 0)
{
assert(false);
return false;
}
while (avcodec_receive_frame(m_codec_ctx, m_dec_frame) == 0)
{
// read out dec audio data
...
}
return true;
}
I just use av_init_packet() to init the local variable; av_packet_unref() and av_packet_free() are never called. Is that valid? Is there any memory leak problem?
av_init_packet doesn't allocate anything. It just sets default values for the AVPacket structure and doesn't touch the data and size fields. You should keep track of the .data part: whenever anything is allocated there, it has to be freed before it is overwritten.
According to the code above, you'll probably want to use av_frame_free and/or av_frame_unref before you are done with the current frame and the next frame kicks in.
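For example (a sketch only, reusing the question's m_codec_ctx and m_dec_frame; the packet merely borrows the caller-owned audioDataPtr, so no packet-level free is needed here):
AVPacket pkt_tmp;
av_init_packet(&pkt_tmp);                 // sets defaults only, allocates nothing
pkt_tmp.data = (uint8_t *)audioDataPtr;   // borrowed memory, owned by the caller
pkt_tmp.size = audioDataSize;
if (avcodec_send_packet(m_codec_ctx, &pkt_tmp) < 0)
    return false;
while (avcodec_receive_frame(m_codec_ctx, m_dec_frame) == 0)
{
    // ... read out the decoded audio data ...
    av_frame_unref(m_dec_frame);          // drop the frame's buffers before the next receive
}
return true;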

file size and buffer overshoot

I have a function which opens a file from an SD card, uses the file size to set the size of a buffer, writes a block of information to that buffer, then does something with that information, as shown in this code:
const char *filename = "filename.txt";
uint16_t duration;
uint16_t pixel;
int q = 0;
int w = 0;
bool largefile;
File f;
int readuntil;
long large_buffer;
f = SD.open(filename);
if(f.size() > 3072) {
w = 3072;
} else {
w = f.size();
}
uint8_t buffer[w];
while(f.available()) {
f.read(buffer, sizeof(buffer));
while(q < sizeof(buffer)) {
doStuffWithInformation(buffer[q++]);
}
q=0;
}
f.close();
This works great with smaller file sizes, but anything over the hard buffer-size limit of 3072 (which I arrived at empirically; it's just the amount of memory that can be safely committed to this function) runs into a problem. Larger files read fine until they hit the last loop of while(f.available()), where they read the end of the file but then continue reading the buffer, the tail end of which is filled with data from the previous loop that wasn't overwritten by the latest f.read(). How can I make sure that the last loop of while(f.available()) only works with the information that was written to the buffer during the current loop? My only idea right now is to solve for factors of the file size and set the buffer size to the largest factor less than 3072, but this seems intensive to run every time this function is called. Is there an elegant solution staring me in the face?
Your program is not behaving correctly because f.read() is not guaranteed to fill the whole buffer. Moreover, this is bound to happen when you read the last chunk of the file, unless the file size is a multiple of the buffer size (3072 in your case).
While the Arduino documentation (https://www.arduino.cc/en/Reference/FileRead) doesn't say so, the SD read function returns the number of bytes actually read. See the code of the library here: https://github.com/arduino-libraries/SD/blob/master/src/utility/SdFile.cpp, int16_t SdFile::read(void* buf, uint16_t nbyte).
Knowing that, you should change your loop as follows (while also rewriting it as a for loop for better readability and removing the q definition above):
while(f.available()) {
uint16_t sz = f.read(buffer, sizeof(buffer));
for (uint16_t q = 0; q < sz; ++q) {
doStuffWithInformation(buffer[q]);
}
}
On a side note, now that you have this logic in place, it would make sense to do away with the variable-length array and use a fixed buffer of size 512 - the standard sector size on the SD card. Most likely, it will yield the same read performance, and slightly better performance for sizeof, which becomes a compile-time constant rather than a run-time calculation. It also makes your program simpler. That gives the following code:
f = SD.open(filename);
...
uint8_t buffer[512];
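Putting both fixes together, the whole routine might look like this (a sketch only; note that read() can also return a negative value on error, which a robust version should check):
const char *filename = "filename.txt";
File f = SD.open(filename);
if (f) {
    uint8_t buffer[512]; // one SD sector
    while (f.available()) {
        int16_t sz = f.read(buffer, sizeof(buffer));
        for (int16_t q = 0; q < sz; ++q) {
            doStuffWithInformation(buffer[q]);
        }
    }
    f.close();
}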

What should I do when write returns a smaller size than requested?

I am writing a wrapper around generic file operations and do not know how to handle the case when write returns a smaller size written than provided.
The man page for write says:
The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)
From my understanding of the above, it's a mixture of errors (medium full) and invitations to come back later (interrupted call). If my file descriptors are all non-blocking, I should not get the interrupted case, and then the only reason would be an error. Am I right?
Code example:
ssize_t size_written = write(fd, str, count);
if (size_written == -1) {
if (errno == EAGAIN) {
// poll on fd and come back later
} else {
// throw an error
}
} else if (size_written < count) {
// ***************
// what should I do here ?
// throw an error ?
// ***************
}
You need to use the raw I/O functions in a loop:
ssize_t todo = count;
for (ssize_t n; todo > 0; )
{
n = write(fd, str, todo);
if (n == -1)
{
    if (errno == EINTR)
        continue; // interrupted before anything was written: just retry
    // error
    break;
}
str += n;
todo -= n;
}
if (todo != 0) { /* error */ }
The special condition concerning EINTR allows the write call to be interrupted by a signal without causing the entire operation to fail. Otherwise, we expect to be able to write all data eventually.
If you can't finish writing all data because your file descriptor is non-blocking and cannot accept any data at the moment, you have to store the remaining data and try again later when the file descriptor has signalled that it's ready for writing again.
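For the non-blocking case, here is a minimal sketch assuming a poll()-based loop; a real event-driven design would instead buffer the remainder and return to its event loop rather than blocking in poll():
#include <errno.h>
#include <poll.h>
#include <unistd.h>

// Write all of buf, waiting with poll() whenever the fd is not ready.
ssize_t write_all_nonblocking(int fd, const char *buf, size_t len)
{
    size_t done = 0;
    while (done < len)
    {
        ssize_t n = write(fd, buf + done, len - done);
        if (n >= 0)
        {
            done += (size_t)n;      // short writes just advance the cursor
            continue;
        }
        if (errno == EINTR)
            continue;               // interrupted by a signal: retry
        if (errno == EAGAIN || errno == EWOULDBLOCK)
        {
            struct pollfd pfd = { fd, POLLOUT, 0 };
            if (poll(&pfd, 1, -1) < 0 && errno != EINTR)
                return -1;          // poll itself failed
            continue;               // fd should be writable now
        }
        return -1;                  // real error
    }
    return (ssize_t)done;
}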

Designing a fast "rolling window" file reader

I'm writing an algorithm in C++ that scans a file with a "sliding window," meaning it will scan bytes 0 to n, do something, then scan bytes 1 to n+1, do something, and so forth, until the end is reached.
My first algorithm was to read the first n bytes, do something, dump one byte, read a new byte, and repeat. This was very slow because calling ReadFile to fetch one byte at a time from the HDD was inefficient (about 100 kB/s).
My second algorithm involves reading a chunk of the file (perhaps n*1000 bytes, meaning the whole file if it's not too large) into a buffer and reading individual bytes off the buffer. Now I get about 10MB/s (decent SSD + Core i5, 1.6GHz laptop).
My question: Do you have suggestions for even faster models?
edit: My big buffer (relative to the window size) is implemented as follows:
- for a rolling window of 5kB, the buffer is initialized to 5MB
- read the first 5MB of the file into the buffer
- the window pointer starts at the beginning of the buffer
- upon shifting, the window pointer is incremented
- when the window pointer nears the end of the 5MB buffer, (say at 4.99MB), copy the remaining 0.01MB to the beginning of the buffer, reset the window pointer to the beginning, and read an additional 4.99MB into the buffer.
- repeat
edit 2 - the actual implementation (removed)
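Since the actual implementation was removed, here is a minimal sketch of the scheme from the first edit; the file name, window size W and buffer size B are illustrative:
#include <cstdio>
#include <cstring>
#include <vector>

const std::size_t W = 5 * 1024;            // rolling window: 5 kB
const std::size_t B = 5 * 1024 * 1024;     // big buffer: 5 MB

void scanFile(const char* path)
{
    std::vector<unsigned char> buf(B);
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return;
    std::size_t valid = std::fread(&buf[0], 1, B, f); // initial fill
    std::size_t win = 0;                              // window start offset
    while (win + W <= valid)
    {
        // ... "do something" with buf[win .. win+W) ...
        ++win;                                        // shift the window by one byte
        if (win + W > valid && !std::feof(f))
        {
            std::size_t tail = valid - win;           // unprocessed bytes
            std::memmove(&buf[0], &buf[win], tail);   // copy them to the front
            valid = tail + std::fread(&buf[tail], 1, B - tail, f);
            win = 0;
        }
    }
    std::fclose(f);
}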
Thank you all for the many insightful responses. It was hard to select a "best answer"; they were all excellent and helped with my coding.
I use a sliding window in one of my apps (actually, several layers of sliding windows working on top of each other, but that is outside the scope of this discussion). The window uses a memory-mapped file view via CreateFileMapping() and MapViewOfFile(), then I have an abstraction layer on top of that. I ask the abstraction layer for any range of bytes I need, and it ensures that the file mapping and file view are adjusted accordingly so those bytes are in memory. Every time a new range of bytes is requested, the file view is adjusted only if needed.
The file view is positioned and sized on page boundaries that are even multiples of the system granularity as reported by GetSystemInfo(). Just because a scan reaches the end of a given byte range does not necessarily mean it has reached the end of a page boundary yet, so the next scan may not need to alter the file view at all, the next bytes are already in memory. If the first requested byte of a range exceeds the right-hand boundary of a mapped page, the left edge of the file view is adjusted to the left-hand boundary of the requested page and any pages to the left are unmapped. If the last requested byte in the range exceeds the right-hand boundary of the right-most mapped page, a new page is mapped and added to the file view.
It sounds more complex than it really is to implement once you get into the coding of it:
Creating a View Within a File
It sounds like you are scanning bytes in fixed-sized blocks, so this approach is very fast and very efficient for that. Based on this technique, I can sequentially scan multi-gigabyte files from start to end fairly quickly, usually a minute or less on my slowest machine. If your files are smaller than the system granularity, or even just a few megabytes, you will hardly notice any time elapsed at all (unless your scans themselves are slow).
Update: here is a simplified variation of what I use:
class FileView
{
private:
DWORD m_AllocGran;
DWORD m_PageSize;
HANDLE m_File;
unsigned __int64 m_FileSize;
HANDLE m_Map;
unsigned __int64 m_MapSize;
LPBYTE m_View;
unsigned __int64 m_ViewOffset;
DWORD m_ViewSize;
void CloseMap()
{
CloseView();
if (m_Map != NULL)
{
CloseHandle(m_Map);
m_Map = NULL;
}
m_MapSize = 0;
}
void CloseView()
{
if (m_View != NULL)
{
UnmapViewOfFile(m_View);
m_View = NULL;
}
m_ViewOffset = 0;
m_ViewSize = 0;
}
bool EnsureMap(unsigned __int64 Size)
{
// do not exceed EOF or else the file on disk will grow!
Size = min(Size, m_FileSize);
if ((m_Map == NULL) ||
(m_MapSize != Size))
{
// a new map is needed...
CloseMap();
ULARGE_INTEGER ul;
ul.QuadPart = Size;
m_Map = CreateFileMapping(m_File, NULL, PAGE_READONLY, ul.HighPart, ul.LowPart, NULL);
if (m_Map == NULL)
return false;
m_MapSize = Size;
}
return true;
}
bool EnsureView(unsigned __int64 Offset, DWORD Size)
{
if ((m_View == NULL) ||
(Offset < m_ViewOffset) ||
((Offset + Size) > (m_ViewOffset + m_ViewSize)))
{
// the requested range is not already in view...
// round down the offset to the nearest allocation boundary
unsigned __int64 ulNewOffset = ((Offset / m_AllocGran) * m_AllocGran);
// round up the size to the next page boundary
DWORD dwNewSize = ((((Offset - ulNewOffset) + Size) + (m_PageSize-1)) & ~(m_PageSize-1));
// if the new view will exceed EOF, truncate it
unsigned __int64 ulOffsetInFile = (ulNewOffset + dwNewSize);
if (ulOffsetInFile > m_FileSize)
dwNewSize -= (ulOffsetInFile - m_FileSize);
if ((m_View == NULL) ||
(m_ViewOffset != ulNewOffset) ||
(m_ViewSize != dwNewSize))
{
// a new view is needed...
CloseView();
// make sure the memory map is large enough to contain the entire view
if (!EnsureMap(ulNewOffset + dwNewSize))
return false;
ULARGE_INTEGER ul;
ul.QuadPart = ulNewOffset;
m_View = (LPBYTE) MapViewOfFile(m_Map, FILE_MAP_READ, ul.HighPart, ul.LowPart, dwNewSize);
if (m_View == NULL)
return false;
m_ViewOffset = ulNewOffset;
m_ViewSize = dwNewSize;
}
}
return true;
}
public:
FileView() :
m_AllocGran(0),
m_PageSize(0),
m_File(INVALID_HANDLE_VALUE),
m_FileSize(0),
m_Map(NULL),
m_MapSize(0),
m_View(NULL),
m_ViewOffset(0),
m_ViewSize(0)
{
// map views need to be positioned on even multiples
// of the system allocation granularity. let's size
// them on even multiples of the system page size...
SYSTEM_INFO si = {0};
GetSystemInfo(&si); // returns void and cannot fail
m_AllocGran = si.dwAllocationGranularity;
m_PageSize = si.dwPageSize;
}
~FileView()
{
CloseFile();
}
bool OpenFile(LPTSTR FileName)
{
CloseFile();
if ((m_AllocGran == 0) || (m_PageSize == 0))
return false;
HANDLE hFile = CreateFile(FileName, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
if (hFile == INVALID_HANDLE_VALUE)
return false;
ULARGE_INTEGER ul;
ul.LowPart = GetFileSize(hFile, &ul.HighPart);
if ((ul.LowPart == INVALID_FILE_SIZE) && (GetLastError() != 0))
{
CloseHandle(hFile);
return false;
}
m_File = hFile;
m_FileSize = ul.QuadPart;
return true;
}
void CloseFile()
{
CloseMap();
if (m_File != INVALID_HANDLE_VALUE)
{
CloseHandle(m_File);
m_File = INVALID_HANDLE_VALUE;
}
m_FileSize = 0;
}
unsigned __int64 FileSize() const
{
return m_FileSize;
}
bool AccessBytes(unsigned __int64 Offset, DWORD Size, LPBYTE *Bytes, DWORD *Available)
{
if (Bytes) *Bytes = NULL;
if (Available) *Available = 0;
if ((m_FileSize != 0) && (Offset < m_FileSize))
{
// make sure the requested range is in view
if (!EnsureView(Offset, Size))
return false;
// near EOF, the available bytes may be less than requested
DWORD dwOffsetInView = (Offset - m_ViewOffset);
if (Bytes) *Bytes = &m_View[dwOffsetInView];
if (Available) *Available = min(m_ViewSize - dwOffsetInView, Size);
}
return true;
}
};
It can then be used like this:
FileView fv;
if (fv.OpenFile(TEXT("C:\\path\\file.ext")))
{
LPBYTE data;
DWORD len;
unsigned __int64 offset = 0, filesize = fv.FileSize();
while (offset < filesize)
{
if (!fv.AccessBytes(offset, some size here, &data, &len))
break; // error
if (len == 0)
break; // unexpected EOF
// use data up to len bytes as needed...
offset += len;
}
fv.CloseFile();
}
This code is designed to allow random jumping anywhere in the file at any data size. Since you are reading bytes sequentially, some of the logic can be simplified as needed.
Your new algorithm only pays 0.1% of the I/O inefficiencies... not worth worrying about.
To get further throughput improvement, you should take a closer look at the "do something" step. See whether you can reuse part of the result from an overlapping window. Check cache behavior. Check if there's a better algorithm for the same computation.
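As a concrete (if trivial) illustration of reusing work between overlapping windows, a running sum over a window of n bytes can be updated in O(1) per shift instead of recomputed in O(n); a rolling hash follows the same pattern. This is only a sketch and the names are illustrative:
#include <cstdint>
#include <cstddef>

void scan(const std::uint8_t* data, std::size_t len, std::size_t n)
{
    if (len < n) return;
    std::uint64_t sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += data[i];                  // pay O(n) once for the first window
    for (std::size_t i = n; i < len; ++i)
    {
        // ... use sum for window [i - n, i) ...
        sum += data[i] - data[i - n];    // slide the window by one byte in O(1)
    }
    // ... use sum for the final window [len - n, len) ...
}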
You have the basic I/O technique down. The easiest improvement you can make now is to pick a good buffer size. With some experimentation, you'll find that read performance increases quickly with buffer size until you hit about 16k, then performance begins to level out.
Your next task is probably to profile your code, and see where it is spending its time. When dealing with performance, it is always best to measure rather than guess. You don't mention what OS you're using, so I won't make any profiler recommendations.
You can also try to reduce the amount of copying/moving of data between your buffer and your workspace. Less copying is generally better. If you can process your data in-place instead of moving it to a new location, that's a win. (I see from your edits you're already doing this.)
Finally, if you're processing many gigabytes of archived information, you should consider keeping your data compressed. It will come as a surprise to many people that it is faster to read compressed data and then decompress it than it is to read uncompressed data. My favorite algorithm for this purpose is LZO, which doesn't compress as well as some other algorithms but decompresses impressively fast. This kind of setup is only worth the engineering effort if:
- Your job is I/O bound.
- You are reading many GB of data.
- You're running the program frequently, so it saves you a lot of time to make it run faster.

c++ stl: a good way to parse a sensor response

Attention please:
I already implemented this stuff, just not in any way generic or elegant. This question is motivated by my wanting to learn more tricks with the STL, not by the problem itself.
I think this is clear from the way I stated that I already solved the problem, but many people have answered, with the best intentions, with solutions to the problem rather than answers to the question "how to solve this the STL way". I am really sorry if I phrased the question in a confusing way; I hate to waste people's time.
Ok, here it comes:
I get a string full of encoded data.
It comes in like this:
N >> 64 bytes
every 3 bytes get decoded into an int value
after at most 64 bytes (yes, not divisible by 3!) comes a byte as checksum,
followed by a line feed,
and so it goes on.
It ends when 2 successive linefeeds are found.
It looks like a nice, or at least OK, data format, but parsing it elegantly the STL way is a real bit**.
I have done the thing "manually".
But I would be interested to know if there is an elegant way, with STL or maybe Boost magic, that doesn't involve copying the thing.
Clarification:
It gets really big sometimes. The N >> 64 bytes was more like N >>> 64 bytes ;-)
UPDATE
Ok, the N>64 bytes seems to be confusing. It is not important.
The sensor takes M measurements as integers, encodes each of them into 3 bytes, and sends them one after another.
When the sensor has sent 64 bytes of data, it inserts a checksum over those 64 bytes and an LF. It doesn't care if one of the encoded integers is "broken up" by that; it just continues in the next line. (That only has the effect of making the data nicely human-readable, but it is kind of nasty to parse elegantly.)
When it has finished sending data, it inserts a checksum byte and LFLF.
So one data chunk can look like this, for N=129=43x3:
|<--64byte-data-->|1byte checksum|LF
|<--64byte-data-->|1byte checksum|LF
|<--1byte-data-->|1byte checksum|LF
LF
When I have M=22 measurements, this means I have N=66 bytes of data.
After 64 bytes it inserts the checksum and LF and continues.
This way it breaks up my last measurement, which is encoded in bytes 64, 65 and 66. It now looks like this: 64, checksum, LF, 65, 66.
Since a multiple of 3 divided by 64 leaves a remainder two times out of three, and a different one each time, it is nasty to parse.
I had 2 solutions:
check the checksum, concatenate the data into one string that contains only data bytes, decode.
run through with iterators and one nasty if-construct to avoid copying.
I just thought there might be something better. I mused about std::transform, but it wouldn't work because of the 3-bytes-are-one-int thing.
As much as I like the STL, I don't think there's anything wrong with doing things manually, especially if the problem does not really fall into the cases the STL was made for. Then again, I'm not sure why you ask. Maybe you need an STL input iterator that (checks and) discards the checksums and LF characters and emits the integers?
I assume the encoding is such that LF can only appear at those places, i.e., some kind of Base-64 or similar?
It seems to me that something as simple as the following should solve the problem:
string line;
while( getline( input, line ) && line != "" ) {
int val = atoi( line.substr(0, 3 ).c_str() );
string data = line.substr( 3, line.size() - 4 );
char csum = line[ line.size() - 1 ];
// process val, data and csum
}
In a real implementation you would want to add error checking, but the basic logic should remain the same.
As others have said, there is no silver bullet in stl/boost to elegantly solve your problem. If you want to parse your chunk directly via pointer arithmetic, perhaps you can take inspiration from std::iostream and hide the messy pointer arithmetic in a custom stream class. Here's a half-arsed solution I came up with:
#include <cctype>
#include <iostream>
#include <vector>
#include <boost/lexical_cast.hpp>
class Stream
{
public:
enum StateFlags
{
goodbit = 0,
eofbit = 1 << 0, // End of input packet
failbit = 1 << 1 // Corrupt packet
};
Stream() : csum_(0), state_(failbit), pos_(0), end_(0) {}
Stream(char* begin, char* end) {open(begin, end);}
void open(char* begin, char* end)
{state_=goodbit; csum_=0; pos_=begin; end_=end;}
StateFlags rdstate() const {return static_cast<StateFlags>(state_);}
bool good() const {return state_ == goodbit;}
bool fail() const {return (state_ & failbit) != 0;}
bool eof() const {return (state_ & eofbit) != 0;}
Stream& read(int& measurement)
{
measurement = readDigit() * 100;
measurement += readDigit() * 10;
measurement += readDigit();
return *this;
}
private:
int readDigit()
{
int digit = 0;
// Check if we are at end of packet
if (pos_ == end_) {state_ |= eofbit; return 0;}
/* We should be at least csum|lf|lf away from end, and we are
not expecting csum or lf here. */
if (pos_+3 >= end_ || pos_[0] == '\n' || pos_[1] == '\n')
{
state_ |= failbit;
return 0;
}
if (!getDigit(digit)) {return 0;}
csum_ = (csum_ + digit) % 10;
++pos_;
// If we are at checksum, check and consume it, along with linefeed
if (pos_[1] == '\n')
{
int checksum = 0;
if (!getDigit(checksum) || (checksum != csum_)) {state_ |= failbit;}
csum_ = 0;
pos_ += 2;
// If there is a second linefeed, we are at end of packet
if (*pos_ == '\n') {pos_ = end_;}
}
return digit;
}
bool getDigit(int& digit)
{
bool success = std::isdigit(*pos_);
if (success)
digit = boost::lexical_cast<int>(*pos_);
else
state_ |= failbit;
return success;
}
int csum_;
unsigned int state_;
char* pos_;
char* end_;
};
int main()
{
// Use (8-byte + csum + LF) fragments for this example
char data[] = "\
001002003\n\
300400502\n\
060070081\n\n";
std::vector<int> measurements;
Stream s(data, data + sizeof(data));
int meas = 0;
while (s.read(meas).good())
{
measurements.push_back(meas);
std::cout << meas << " ";
}
return 0;
}
Maybe you'll want to add extra StateFlags to determine if failure is due to checksum error or framing error. Hope this helps.
You should think of your communication protocol as being layered. Treat
|<--64byte-data-->|1byte checksum|LF
as fragments to be reassembled into larger packets of contiguous data. Once the larger packet is reconstituted, it is easier to parse its data contiguously (you don't have to deal with measurements being split up across fragments). Many existing network protocols (such as UDP/IP) do this sort of reassembly of fragments into packets.
It's possible to read the fragments directly into their proper "slot" in the packet buffer. Since your fragments have footers instead of headers, and there is no out-of-order arrival of your fragments, this should be fairly easy to code (compared to copyless IP reassembly algorithms). Once you receive an "empty" fragment (the duplicate LF), this marks the end of the packet.
Here is some sample code to illustrate the idea:
#include <vector>
#include <cassert>
class Reassembler
{
public:
// Constructs reassembler with given packet buffer capacity
Reassembler(int capacity) : buf_(capacity) {reset();}
// Returns bytes remaining in packet buffer
int remaining() const {return buf_.end() - pos_;}
// Returns a pointer to where the next fragment should be read
char* back() {return &*pos_;}
// Advances the packet's position cursor for the next fragment
void push(int size) {pos_ += size; if (size == 0) complete_ = true;}
// Returns true if an empty fragment was pushed to indicate end of packet
bool isComplete() const {return complete_;}
// Resets the reassembler so it can process a new packet
void reset() {pos_ = buf_.begin(); complete_ = false;}
// Returns a pointer to the accumulated packet data
char* data() {return &buf_[0];}
// Returns the size in bytes of the accumulated packet data
int size() const {return pos_ - buf_.begin();}
private:
std::vector<char> buf_;
std::vector<char>::iterator pos_;
bool complete_;
};
int readFragment(char* dest, int maxBytes, char delimiter)
{
// Read next fragment from source and save to dest pointer
// Return number of bytes in fragment, except delimiter character
}
bool verifyChecksum(char* fragPtr, int size)
{
// Returns true if fragment checksum is valid
}
void processPacket(char* data, int size)
{
// Extract measurements which are now stored contiguously in packet
}
int main()
{
const int kChecksumSize = 1;
Reassembler reasm(1000); // Use realistic capacity here
while (true)
{
while (!reasm.isComplete())
{
char* fragDest = reasm.back();
int fragSize = readFragment(fragDest, reasm.remaining(), '\n');
if (fragSize > 1)
assert(verifyChecksum(fragDest, fragSize));
reasm.push(fragSize > 0 ? fragSize - kChecksumSize : 0); // an empty fragment must push 0 to mark completion
}
processPacket(reasm.data(), reasm.size());
reasm.reset();
}
}
The trick will be making an efficient readFragment function that stops at every newline delimiter and stores the incoming data into the given destination buffer pointer. If you tell me how you acquire your sensor data, then I can perhaps give you more ideas.
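For illustration only, a readFragment over an std::istream might look like the sketch below; the stub above left the data source implicit, so the stream parameter here is an assumption (a recv()-based version would have the same shape):
#include <istream>

// Read one fragment into dest, stopping at the delimiter (which is not stored).
// Returns the number of bytes placed in dest, excluding the delimiter.
int readFragment(std::istream& in, char* dest, int maxBytes, char delimiter)
{
    int n = 0;
    char c;
    while (n < maxBytes && in.get(c))
    {
        if (c == delimiter)
            break;
        dest[n++] = c;
    }
    return n;
}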
An elegant solution this isn't. It would be more so by using a "transition matrix", and only reading one character at a time. Not my style. Yet this code has a minimum of redundant data movement, and it seems to do the job. Minimally C++, it really is just a C program. Adding iterators is left as an exercise for the reader. The data stream wasn't completely defined, and there was no defined destination for the converted data. Assumptions noted in comments. Lots of printing should show functionality.
// convert series of 3 ASCII decimal digits to binary
// there is a checksum byte at least once every 64 bytes - it can split a digit series
// if the interval is less than 64 bytes, it must be followed by LF (to identify it)
// if the interval is a full 64 bytes, the checksum may or may not be followed by LF
// checksum restricted to a simple sum modulo 10 to keep ASCII format
// checksum computations are only printed, to allow continuation of the demo and so results can be
// inserted back into the data for testing
// there is no verification of the 3 byte sets of digits
// results are just printed, non-zero return indicates error
int readData(void) {
int binValue = 0, digitNdx = 0, sensorCnt = 0, lineCnt = 0;
char oneDigit;
string sensorTxt;
while( getline( cin, sensorTxt ) ) {
int i, restart = 0, checkSum = 0, size = sensorTxt.size()-1;
if(size < 0)
break;
lineCnt++;
if(sensorTxt[0] == '#')
continue;
printf("INPUT: %s\n", &sensorTxt[0]); // gag
while(restart<size) {
for(i=0; i<min(64, size); i++) {
oneDigit = sensorTxt[i+restart] & 0xF;
checkSum += oneDigit;
binValue = binValue*10 + oneDigit;
//printf("%3d-%X ", binValue, sensorTxt[i+restart]);
digitNdx++;
if(digitNdx == 3) {
sensorCnt++;
printf("READING# %d (LINE %d) = %d CKSUM %d\n",
sensorCnt, lineCnt, binValue, checkSum);
digitNdx = 0;
binValue = 0;
}
}
oneDigit = sensorTxt[i+restart] & 0x0F;
char compCheckDigit = (10-(checkSum%10)) % 10;
printf(" CKSUM at sensorCnt %d ", sensorCnt);
if((checkSum+oneDigit) % 10)
printf("ERR got %c exp %c\n", oneDigit|0x30, compCheckDigit|0x30);
else
printf("OK\n");
i++;
restart += i;
}
}
if(digitNdx)
return -2;
else
return 0;
}
The data definition was extended with comments, so you can use the following as is:
# normal 64 byte lines with 3 digit value split across lines
00100200300400500600700800901001101201301401501601701801902002105
22023024025026027028029030031032033034035036037038039040041042046
# short lines, partial values - remove checksum digit to combine short lines
30449
0451
0460479
0480490500510520530540550560570580590600610620630641
# long line with embedded checksums every 64 bytes
001002003004005006007008009010011012013014015016017018019020021052202302402502602702802903003103203303403503603703803904004104204630440450460470480490500510520530540550560570580590600610620630640
# dangling digit at end of file (with OK checksum)
37
Why are you concerned with copying? Is it the time overhead or the space overhead?
It sounds like you are reading all of the unparsed data into a big buffer, and now you want to write code that makes the big buffer of unparsed data look like a slightly smaller buffer of parsed data (the data minus the checksums and linefeeds), to avoid the space overhead involved in copying it into the slightly smaller buffer.
Adding a complicated layer of abstraction isn't going to help with the time overhead unless you only need a small portion of the data. And if that's the case, maybe you could just figure out which small portion you need, and copy that. (Then again, most of the abstraction layer may be already written for you, e.g. the Boost iterator library.)
An easier way to reduce the space overhead is to read the data in smaller chunks (or a line at a time), and parse it as you go. Then you only need to store the parsed version in a big buffer. (This assumes you're reading it from a file / socket / port, rather than being passed a large buffer that is not under your control.)
Another way to reduce the space overhead is to overwrite the data in place as you parse it. You will incur the cost of copying it, but you will only need one buffer. (This assumes that the 64-byte data doesn't grow when you parse it, i.e. it is not compressed.)
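For instance, each 3-digit ASCII group decodes to a value in 0..999, which fits in a uint16_t, so the parsed stream is smaller than the raw one and can overwrite it as you go. A sketch, assuming the checksums and linefeeds have already been stripped and that buf is suitably aligned for uint16_t:
#include <cstdint>
#include <cstddef>

// Decode 3-digit ASCII groups in place; returns the number of measurements.
std::size_t parseInPlace(char* buf, std::size_t rawLen)
{
    std::uint16_t* out = reinterpret_cast<std::uint16_t*>(buf);
    std::size_t count = 0;
    for (std::size_t i = 0; i + 3 <= rawLen; i += 3)
    {
        std::uint16_t v = static_cast<std::uint16_t>(
            (buf[i] - '0') * 100 + (buf[i + 1] - '0') * 10 + (buf[i + 2] - '0'));
        out[count++] = v; // writes 2 bytes where 3 were read, so it never overtakes
    }
    return count;
}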