Why is Thrift TBinaryProtocol's reading of received data more complex than just size + content?

Thrift version is 0.8. I'm implementing my own Thrift transport layer in a C++ client that uses the binary protocol. My server uses the framed transport and the binary protocol, and I'm sure it works correctly. I get a "No more data to read" exception in the readAll function in TTransport.h. I traced the call chain and found this in TBinaryProtocol.tcc:
template <class Transport_>
uint32_t TBinaryProtocolT<Transport_>::readMessageBegin(std::string& name,
TMessageType& messageType,
int32_t& seqid) {
uint32_t result = 0;
int32_t sz;
result += readI32(sz); // sz should be the whole return buf len without the first 4 bytes?
if (sz < 0) {
// Check for correct version number
int32_t version = sz & VERSION_MASK;
if (version != VERSION_1) {
throw TProtocolException(TProtocolException::BAD_VERSION, "Bad version identifier");
}
messageType = (TMessageType)(sz & 0x000000ff);
result += readString(name);
result += readI32(seqid);
} else {
if (this->strict_read_) {
throw TProtocolException(TProtocolException::BAD_VERSION, "No version identifier... old protocol client in strict mode?");
} else {
// Handle pre-versioned input
int8_t type;
result += readStringBody(name, sz);
result += readByte(type); // No more data to read in buf, so exception here
messageType = (TMessageType)type;
result += readI32(seqid);
}
}
return result;
}
So my question is: with the framed transport, the data should consist ONLY of size + content (result, seqid, function name, ...), and that is exactly what my server packs. My client reads the first 4 bytes as the length and uses it to fetch the whole content. Is there anything else left to read after that?
Here is my client code, which I believe is quite simple. I have marked the most important part with a comment.
class CthriftCli
{
......
TMemoryBuffer write_buf_;
TMemoryBuffer read_buf_;
enum CthriftConn::State state_;
uint32_t frameSize_;
};
void CthriftCli::OnConn4SgAgent(const TcpConnectionPtr& conn)
{
if(conn->connected() ){
conn->setTcpNoDelay(true);
wp_tcp_conn_ = boost::weak_ptr<muduo::net::TcpConnection>(conn);
if(unlikely(!(sp_countdown_latch_4_conn_.get()))) {
return; // OnConn4SgAgent returns void
}
sp_countdown_latch_4_conn_->countDown();
}
}
void CthriftCli::OnMsg4SgAgent(const muduo::net::TcpConnectionPtr& conn,
muduo::net::Buffer* buffer,
muduo::Timestamp receiveTime)
{
bool more = true;
while (more)
{
if (state_ == CthriftConn::kExpectFrameSize)
{
if (buffer->readableBytes() >= 4)
{
frameSize_ = static_cast<uint32_t>(buffer->peekInt32());
state_ = CthriftConn::kExpectFrame;
}
else
{
more = false;
}
}
else if (state_ == CthriftConn::kExpectFrame)
{
if (buffer->readableBytes() >= frameSize_)
{
uint8_t* buf = reinterpret_cast<uint8_t*>((const_cast<char*>(buffer->peek())));
read_buf_.resetBuffer(buf, sizeof(int32_t) + frameSize_, TMemoryBuffer::COPY); // all the return buf, include first size bytes
if(unlikely(!(sp_countdown_latch_.get()))){
return;
}
sp_countdown_latch_->countDown();
buffer->retrieve(sizeof(int32_t) + frameSize_);
state_ = CthriftConn::kExpectFrameSize;
}
else
{
more = false;
}
}
}
}
uint32_t CthriftCli::read(uint8_t* buf, uint32_t len) {
if (read_buf_.available_read() == 0) {
if(unlikely(!(sp_countdown_latch_.get()))){
return 0;
}
sp_countdown_latch_->wait();
}
return read_buf_.read(buf, len);
}
void CthriftCli::readEnd(void) {
read_buf_.resetBuffer();
}
void CthriftCli::write(const uint8_t* buf, uint32_t len) {
return write_buf_.write(buf, len);
}
uint32_t CthriftCli::writeEnd(void)
{
uint8_t* buf;
uint32_t size;
write_buf_.getBuffer(&buf, &size);
if(unlikely(!(sp_countdown_latch_4_conn_.get()))) {
return 0;
}
sp_countdown_latch_4_conn_->wait();
TcpConnectionPtr sp_tcp_conn(wp_tcp_conn_.lock());
if (sp_tcp_conn && sp_tcp_conn->connected()) {
muduo::net::Buffer send_buf;
send_buf.appendInt32(size);
send_buf.append(buf, size);
sp_tcp_conn->send(&send_buf);
write_buf_.resetBuffer(true);
} else {
#ifdef MUDUO_LOG
MUDUO_LOG_ERROR << "conn error, NOT send";
#endif
}
return size;
}
Could you please give me some hints about this?

You seem to have mixed up the concepts of 'transport' and 'protocol'.
The binary protocol describes how data is encoded (protocol layer).
The framed transport describes how the encoded data is delivered (each message is prefixed by its length); that is the transport layer.
The important part: the binary protocol is not (and should not be) aware of any transport details. So if you add the frame size while encoding at the transport level, you must also interpret the incoming size before passing the received bytes to the binary protocol for decoding. You can, for example, use it to read all the required bytes at once.
After a quick look through your code: try reading (consuming) the 4 bytes of frame size instead of peeking at them. Those bytes should not be visible outside the transport layer.
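To make that separation concrete, here is a minimal sketch written without Thrift or muduo so that it stands alone. The helper name extractFramePayload and its signature are made up for illustration; they are not part of any Thrift API. It assumes the frame header is a 4-byte big-endian length, as the framed transport writes it, and it hands only the payload to whatever decodes the protocol.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not a Thrift API): if `recv` holds at least one
// complete frame, copy that frame's payload (without the 4-byte header) into
// `payload`, report in `consumed` how many bytes to drop from the receive
// buffer (header included), and return true. Otherwise return false.
bool extractFramePayload(const std::vector<uint8_t>& recv,
                         std::vector<uint8_t>& payload,
                         std::size_t& consumed) {
  consumed = 0;
  constexpr std::size_t kHeaderLen = 4;
  if (recv.size() < kHeaderLen) return false;  // header not complete yet

  // Assumption: the length is a 4-byte big-endian integer.
  const uint32_t frameSize = (uint32_t(recv[0]) << 24) |
                             (uint32_t(recv[1]) << 16) |
                             (uint32_t(recv[2]) << 8) | uint32_t(recv[3]);
  if (recv.size() - kHeaderLen < frameSize) return false;  // body not complete

  // Only the payload is passed on; the header never leaves the transport.
  payload.assign(recv.begin() + kHeaderLen,
                 recv.begin() + kHeaderLen + frameSize);
  consumed = kHeaderLen + frameSize;
  return true;
}
```

In the question's client, the equivalent change would presumably be to skip the first four bytes before calling read_buf_.resetBuffer, so that readMessageBegin sees the version word first rather than the frame length.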

Related

Xcode app for macOS. This is how I set up to get audio from a USB mic input. It worked a year ago, now it doesn't. Why?

Here is my audio init code. My app responds when queue buffers are ready, but all data in the buffer is zero. Checking Sound in System Preferences shows that USB Audio CODEC is active in the sound input dialog. audioInit() is called right after the app launches.
{
#pragma mark user data struct
typedef struct MyRecorder
{
AudioFileID recordFile;
SInt64 recordPacket;
Float32 *pSampledData;
MorseDecode *pMorseDecoder;
} MyRecorder;
#pragma mark utility functions
void CheckError(OSStatus error, const char *operation)
{
if(error == noErr) return;
char errorString[20];
// see if it appears to be a 4 char code
*(UInt32*)(errorString + 1) = CFSwapInt32HostToBig(error);
if (isprint(errorString[1]) && isprint(errorString[2]) &&
isprint(errorString[3]) && isprint(errorString[4]))
{
errorString[0] = errorString[5] = '\'';
errorString[6] = '\0';
}
else
{
sprintf(errorString, "%d", (int)error);
}
fprintf(stderr, "Error: %s (%s)\n", operation, errorString);
}
OSStatus MyGetDefaultInputDeviceSampleRate(Float64 *outSampleRate)
{
OSStatus error;
AudioDeviceID deviceID = 0;
AudioObjectPropertyAddress propertyAddress;
UInt32 propertySize;
propertyAddress.mSelector = kAudioHardwarePropertyDefaultInputDevice;
propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
propertyAddress.mElement = 0;
propertySize = sizeof(AudioDeviceID);
error = AudioObjectGetPropertyData(kAudioObjectSystemObject,
&propertyAddress,
0,
NULL,
&propertySize,
&deviceID);
if(error)
return error;
propertyAddress.mSelector = kAudioDevicePropertyNominalSampleRate;
propertyAddress.mScope = kAudioObjectPropertyScopeGlobal;
propertyAddress.mElement = 0;
propertySize = sizeof(Float64);
error = AudioObjectGetPropertyData(deviceID,
&propertyAddress,
0,
NULL,
&propertySize,
outSampleRate);
return error;
}
static int MyComputeRecordBufferSize(const AudioStreamBasicDescription *format,
AudioQueueRef queue,
float seconds)
{
int packets, frames, bytes;
frames = (int)ceil(seconds * format->mSampleRate);
if(format->mBytesPerFrame > 0)
{
bytes = frames * format->mBytesPerFrame;
}
else
{
UInt32 maxPacketSize;
if(format->mBytesPerPacket > 0)
{
// constant packet size
maxPacketSize = format->mBytesPerPacket;
}
else
{
// get the largest single packet size possible
UInt32 propertySize = sizeof(maxPacketSize);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterPropertyMaximumOutputPacketSize,
&maxPacketSize,
&propertySize),
"Couldn't get queues max output packet size");
}
if(format->mFramesPerPacket > 0)
packets = frames / format->mFramesPerPacket;
else
// worst case scenario: 1 frame in a packet
packets = frames;
// sanity check
if(packets == 0)
packets = 1;
bytes = packets * maxPacketSize;
}
return bytes;
}
extern void bridgeToMainThread(MorseDecode *pDecode);
static int callBacks = 0;
// ---------------------------------------------
static void MyAQInputCallback(void *inUserData,
AudioQueueRef inQueue,
AudioQueueBufferRef inBuffer,
const AudioTimeStamp *inStartTime,
UInt32 inNumPackets,
const AudioStreamPacketDescription *inPacketDesc)
{
MyRecorder *recorder = (MyRecorder*)inUserData;
Float32 *pAudioData = (Float32*)(inBuffer->mAudioData);
recorder->pMorseDecoder->pBuffer = pAudioData;
recorder->pMorseDecoder->bufferSize = inNumPackets;
bridgeToMainThread(recorder->pMorseDecoder);
CheckError(AudioQueueEnqueueBuffer(inQueue,
inBuffer,
0,
NULL),
"AudioQueueEnqueueBuffer failed");
printf("packets = %ld, bytes = %ld\n",(long)inNumPackets,(long)inBuffer->mAudioDataByteSize);
callBacks++;
//printf("\ncallBacks = %d\n",callBacks);
//if(callBacks == 0)
//audioStop();
}
static AudioQueueRef queue = {0};
static MyRecorder recorder = {0};
static AudioStreamBasicDescription recordFormat;
void audioInit()
{
// set up format
memset(&recordFormat,0,sizeof(recordFormat));
recordFormat.mFormatID = kAudioFormatLinearPCM;
recordFormat.mChannelsPerFrame = 2;
recordFormat.mBitsPerChannel = 32;
recordFormat.mBytesPerPacket = recordFormat.mBytesPerFrame = recordFormat.mChannelsPerFrame * sizeof(Float32);
recordFormat.mFramesPerPacket = 1;
//recordFormat.mFormatFlags = kAudioFormatFlagsCanonical;
recordFormat.mFormatFlags = kAudioFormatFlagsNativeFloatPacked;
MyGetDefaultInputDeviceSampleRate(&recordFormat.mSampleRate);
UInt32 propSize = sizeof(recordFormat);
CheckError(AudioFormatGetProperty(kAudioFormatProperty_FormatInfo,
0,
NULL,
&propSize,
&recordFormat),
"AudioFormatProperty failed");
recorder.pMorseDecoder = MorseDecode::pInstance();
recorder.pMorseDecoder->m_sampleRate = recordFormat.mSampleRate;
// recorder.pMorseDecoder->setCircularBuffer();
//set up queue
CheckError(AudioQueueNewInput(&recordFormat,
MyAQInputCallback,
&recorder,
NULL,
kCFRunLoopCommonModes,
0,
&queue),
"AudioQueueNewInput failed");
UInt32 size = sizeof(recordFormat);
CheckError(AudioQueueGetProperty(queue,
kAudioConverterCurrentOutputStreamDescription,
&recordFormat,
&size), "Couldn't get queue's format");
// set up buffers and enqueue
const int kNumberRecordBuffers = 3;
int bufferByteSize = MyComputeRecordBufferSize(&recordFormat, queue, AUDIO_BUFFER_DURATION);
for(int bufferIndex = 0; bufferIndex < kNumberRecordBuffers; bufferIndex++)
{
AudioQueueBufferRef buffer;
CheckError(AudioQueueAllocateBuffer(queue,
bufferByteSize,
&buffer),
"AudioQueueAllocateBuffer failed");
CheckError(AudioQueueEnqueueBuffer(queue,
buffer,
0,
NULL),
"AudioQueueEnqueueBuffer failed");
}
}
void audioRun()
{
CheckError(AudioQueueStart(queue, NULL), "AudioQueueStart failed");
}
void audioStop()
{
CheckError(AudioQueuePause(queue), "AudioQueuePause failed");
}
}

This sounds like the new macOS 'microphone privacy' setting, which, if set to 'no access' for your app, will cause precisely this behaviour. So:
Open the System Preferences pane.
Click on 'Security and Privacy'.
Select the Privacy tab.
Click on 'Microphone' in the left-hand pane.
Locate your app in the right-hand pane and tick the checkbox next to it.
Then restart your app and test it.
Tedious, no?
Edit: As stated in the comments, you can't directly request microphone access, but you can detect whether it has been granted to your app or not by calling [AVCaptureDevice authorizationStatusForMediaType: AVMediaTypeAudio].

c++ Protocol buffer sending over network [duplicate]

I'm trying to read / write multiple Protocol Buffers messages from files, in both C++ and Java. Google suggests writing length prefixes before the messages, but there's no way to do that by default (that I could see).
However, the Java API in version 2.1.0 received a set of "Delimited" I/O functions which apparently do that job:
parseDelimitedFrom
mergeDelimitedFrom
writeDelimitedTo
Are there C++ equivalents? And if not, what's the wire format for the size prefixes the Java API attaches, so I can parse those messages in C++?
Update:
These now exist in google/protobuf/util/delimited_message_util.h as of v3.3.0.

I'm a bit late to the party here, but the below implementations include some optimizations missing from the other answers and will not fail after 64MB of input (though it still enforces the 64MB limit on each individual message, just not on the whole stream).
(I am the author of the C++ and Java protobuf libraries, but I no longer work for Google. Sorry that this code never made it into the official lib. This is what it would look like if it had.)
bool writeDelimitedTo(
const google::protobuf::MessageLite& message,
google::protobuf::io::ZeroCopyOutputStream* rawOutput) {
// We create a new coded stream for each message. Don't worry, this is fast.
google::protobuf::io::CodedOutputStream output(rawOutput);
// Write the size.
const int size = message.ByteSize();
output.WriteVarint32(size);
uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
if (buffer != NULL) {
// Optimization: The message fits in one buffer, so use the faster
// direct-to-array serialization path.
message.SerializeWithCachedSizesToArray(buffer);
} else {
// Slightly-slower path when the message is multiple buffers.
message.SerializeWithCachedSizes(&output);
if (output.HadError()) return false;
}
return true;
}
bool readDelimitedFrom(
google::protobuf::io::ZeroCopyInputStream* rawInput,
google::protobuf::MessageLite* message) {
// We create a new coded stream for each message. Don't worry, this is fast,
// and it makes sure the 64MB total size limit is imposed per-message rather
// than on the whole stream. (See the CodedInputStream interface for more
// info on this limit.)
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size)) return false;
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
// Parse the message.
if (!message->MergeFromCodedStream(&input)) return false;
if (!input.ConsumedEntireMessage()) return false;
// Release the limit.
input.PopLimit(limit);
return true;
}

Okay, so I haven't been able to find top-level C++ functions implementing what I need, but some spelunking through the Java API reference turned up the following, inside the MessageLite interface:
void writeDelimitedTo(OutputStream output)
/* Like writeTo(OutputStream), but writes the size of
the message as a varint before writing the data. */
So the Java size prefix is a (Protocol Buffers) varint!
Armed with that information, I went digging through the C++ API and found the CodedStream header, which has these:
bool CodedInputStream::ReadVarint32(uint32 * value)
void CodedOutputStream::WriteVarint32(uint32 value)
Using those, I should be able to roll my own C++ functions that do the job.
They should really add this to the main Message API though; it's missing functionality considering Java has it, and so does Marc Gravell's excellent protobuf-net C# port (via SerializeWithLengthPrefix and DeserializeWithLengthPrefix).
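For anyone who wants to see what that size prefix looks like on the wire, here is a small self-contained sketch of the base-128 varint encoding. The function names appendVarint32 and readVarint32 are my own for illustration; in real code you would normally use the CodedOutputStream/CodedInputStream methods mentioned above.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Append `value` as a base-128 varint: 7 payload bits per byte, least
// significant group first, high bit set on every byte except the last.
void appendVarint32(uint32_t value, std::vector<uint8_t>& out) {
  while (value >= 0x80) {
    out.push_back(static_cast<uint8_t>((value & 0x7F) | 0x80));
    value >>= 7;
  }
  out.push_back(static_cast<uint8_t>(value));
}

// Decode a varint starting at `pos`, advancing `pos` past it. Returns false
// if the input ends early or the encoding runs past the 5 bytes that a
// 32-bit value can need.
bool readVarint32(const std::vector<uint8_t>& in, std::size_t& pos,
                  uint32_t& value) {
  uint32_t result = 0;
  for (int shift = 0; shift < 35; shift += 7) {
    if (pos >= in.size()) return false;
    const uint8_t byte = in[pos++];
    result |= static_cast<uint32_t>(byte & 0x7F) << shift;
    if ((byte & 0x80) == 0) {
      value = result;
      return true;
    }
  }
  return false;
}
```

A message of 300 bytes, for instance, gets the two-byte prefix 0xAC 0x02, while any message shorter than 128 bytes gets a single-byte prefix.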

I solved the same problem using CodedOutputStream/ArrayOutputStream to write the message (with the size) and CodedInputStream/ArrayInputStream to read the message (with the size).
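For example, the following pseudo-code writes the message size followed by the message. (Note that it writes a fixed 4-byte little-endian size, not the varint prefix that Java's writeDelimitedTo uses.)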
const unsigned bufLength = 256;
unsigned char buffer[bufLength];
Message protoMessage;
google::protobuf::io::ArrayOutputStream arrayOutput(buffer, bufLength);
google::protobuf::io::CodedOutputStream codedOutput(&arrayOutput);
codedOutput.WriteLittleEndian32(protoMessage.ByteSize());
protoMessage.SerializeToCodedStream(&codedOutput);
When writing you should also check that your buffer is large enough to fit the message (including the size). And when reading, you should check that your buffer contains a whole message (including the size).
It definitely would be handy if they added convenience methods to C++ API similar to those provided by the Java API.

IstreamInputStream is very fragile with respect to EOFs and other errors that easily occur when it is used together with std::istream. After such an error the protobuf streams are permanently damaged and any buffer data already in use is destroyed. There is proper support for reading from traditional streams in protobuf.
Implement google::protobuf::io::CopyingInputStream and use that together with CopyingInputStreamAdaptor. Do the same for the output variants.
In practice a parsing call ends up in google::protobuf::io::CopyingInputStream::Read(void* buffer, int size) where a buffer is given. The only thing left to do is read into it somehow.
Here's an example for use with Asio synchronous streams (SyncReadStream/SyncWriteStream):
#include <google/protobuf/io/zero_copy_stream_impl_lite.h>
using namespace google::protobuf::io;
template <typename SyncReadStream>
class AsioInputStream : public CopyingInputStream {
public:
AsioInputStream(SyncReadStream& sock);
int Read(void* buffer, int size);
private:
SyncReadStream& m_Socket;
};
template <typename SyncReadStream>
AsioInputStream<SyncReadStream>::AsioInputStream(SyncReadStream& sock) :
m_Socket(sock) {}
template <typename SyncReadStream>
int
AsioInputStream<SyncReadStream>::Read(void* buffer, int size)
{
std::size_t bytes_read;
boost::system::error_code ec;
bytes_read = m_Socket.read_some(boost::asio::buffer(buffer, size), ec);
if(!ec) {
return bytes_read;
} else if (ec == boost::asio::error::eof) {
return 0;
} else {
return -1;
}
}
template <typename SyncWriteStream>
class AsioOutputStream : public CopyingOutputStream {
public:
AsioOutputStream(SyncWriteStream& sock);
bool Write(const void* buffer, int size);
private:
SyncWriteStream& m_Socket;
};
template <typename SyncWriteStream>
AsioOutputStream<SyncWriteStream>::AsioOutputStream(SyncWriteStream& sock) :
m_Socket(sock) {}
template <typename SyncWriteStream>
bool
AsioOutputStream<SyncWriteStream>::Write(const void* buffer, int size)
{
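boost::system::error_code ec;
// write_some() may send only part of the buffer; boost::asio::write()
// keeps writing until all of it has been sent or an error occurs.
boost::asio::write(m_Socket, boost::asio::buffer(buffer, size), ec);
return !ec;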
}
Usage:
AsioInputStream<boost::asio::ip::tcp::socket> ais(m_Socket); // Where m_Socket is a instance of boost::asio::ip::tcp::socket
CopyingInputStreamAdaptor cis_adp(&ais);
CodedInputStream cis(&cis_adp);
Message protoMessage;
uint32_t msg_size;
/* Read message size */
if(!cis.ReadVarint32(&msg_size)) {
// Handle error
}
/* Make sure not to read beyond limit of message */
CodedInputStream::Limit msg_limit = cis.PushLimit(msg_size);
if(!protoMessage.ParseFromCodedStream(&cis)) {
// Handle error
}
/* Remove limit */
cis.PopLimit(msg_limit);

Here you go:
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/io/coded_stream.h>
using namespace google::protobuf::io;
class FASWriter
{
std::ofstream mFs;
OstreamOutputStream *_OstreamOutputStream;
CodedOutputStream *_CodedOutputStream;
public:
FASWriter(const std::string &file) : mFs(file,std::ios::out | std::ios::binary)
{
assert(mFs.good());
_OstreamOutputStream = new OstreamOutputStream(&mFs);
_CodedOutputStream = new CodedOutputStream(_OstreamOutputStream);
}
inline void operator()(const ::google::protobuf::Message &msg)
{
_CodedOutputStream->WriteVarint32(msg.ByteSize());
if ( !msg.SerializeToCodedStream(_CodedOutputStream) )
std::cout << "SerializeToCodedStream error " << std::endl;
}
~FASWriter()
{
delete _CodedOutputStream;
delete _OstreamOutputStream;
mFs.close();
}
};
class FASReader
{
std::ifstream mFs;
IstreamInputStream *_IstreamInputStream;
CodedInputStream *_CodedInputStream;
public:
FASReader(const std::string &file) : mFs(file,std::ios::in | std::ios::binary)
{
assert(mFs.good());
_IstreamInputStream = new IstreamInputStream(&mFs);
_CodedInputStream = new CodedInputStream(_IstreamInputStream);
}
template<class T>
bool ReadNext()
{
T msg;
uint32_t size; // portable replacement for the MSVC-only "unsigned __int32"
bool ret;
if ( (ret = _CodedInputStream->ReadVarint32(&size)) )
{
CodedInputStream::Limit msgLimit = _CodedInputStream->PushLimit(size);
ret = msg.ParseFromCodedStream(_CodedInputStream);
// Always release the limit, even when parsing fails.
_CodedInputStream->PopLimit(msgLimit);
if ( ret )
{
std::cout << "FASReader ReadNext: " << msg.DebugString() << std::endl;
}
}
return ret;
}
~FASReader()
{
delete _CodedInputStream;
delete _IstreamInputStream;
mFs.close();
}
};

I ran into the same issue in both C++ and Python.
For the C++ version, I used a mix of the code Kenton Varda posted on this thread and the code from the pull request he sent to the protobuf team (because the version posted here doesn't handle EOF while the one he sent to github does).
#include <google/protobuf/message_lite.h>
#include <google/protobuf/io/zero_copy_stream.h>
#include <google/protobuf/io/coded_stream.h>
bool writeDelimitedTo(const google::protobuf::MessageLite& message,
google::protobuf::io::ZeroCopyOutputStream* rawOutput)
{
// We create a new coded stream for each message. Don't worry, this is fast.
google::protobuf::io::CodedOutputStream output(rawOutput);
// Write the size.
const int size = message.ByteSize();
output.WriteVarint32(size);
uint8_t* buffer = output.GetDirectBufferForNBytesAndAdvance(size);
if (buffer != NULL)
{
// Optimization: The message fits in one buffer, so use the faster
// direct-to-array serialization path.
message.SerializeWithCachedSizesToArray(buffer);
}
else
{
// Slightly-slower path when the message is multiple buffers.
message.SerializeWithCachedSizes(&output);
if (output.HadError())
return false;
}
return true;
}
bool readDelimitedFrom(google::protobuf::io::ZeroCopyInputStream* rawInput, google::protobuf::MessageLite* message, bool* clean_eof)
{
// We create a new coded stream for each message. Don't worry, this is fast,
// and it makes sure the 64MB total size limit is imposed per-message rather
// than on the whole stream. (See the CodedInputStream interface for more
// info on this limit.)
google::protobuf::io::CodedInputStream input(rawInput);
const int start = input.CurrentPosition();
if (clean_eof)
*clean_eof = false;
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size))
{
if (clean_eof)
*clean_eof = input.CurrentPosition() == start;
return false;
}
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit = input.PushLimit(size);
// Parse the message.
if (!message->MergeFromCodedStream(&input)) return false;
if (!input.ConsumedEntireMessage()) return false;
// Release the limit.
input.PopLimit(limit);
return true;
}
And here is my Python 2 implementation:
from google.protobuf.internal import encoder
from google.protobuf.internal import decoder
# I had to implement this because the tools in google.protobuf.internal.decoder
# read from a buffer, not from a file-like object
def readRawVarint32(stream):
    mask = 0x80 # (1 << 7)
    raw_varint32 = []
    while 1:
        b = stream.read(1)
        # eof
        if b == "":
            break
        raw_varint32.append(b)
        if not (ord(b) & mask):
            # we found a byte starting with a 0, which means it's the last byte of this varint
            break
    return raw_varint32
def writeDelimitedTo(message, stream):
    message_str = message.SerializeToString()
    delimiter = encoder._VarintBytes(len(message_str))
    stream.write(delimiter + message_str)
def readDelimitedFrom(MessageType, stream):
    raw_varint32 = readRawVarint32(stream)
    message = None
    if raw_varint32:
        size, _ = decoder._DecodeVarint32(raw_varint32, 0)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message = MessageType()
        message.ParseFromString(data)
    return message
# In-place version that takes an already built protobuf object.
# In my tests, this is around 20% faster than the other version
# of readDelimitedFrom()
def readDelimitedFrom_inplace(message, stream):
    raw_varint32 = readRawVarint32(stream)
    if raw_varint32:
        size, _ = decoder._DecodeVarint32(raw_varint32, 0)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message.ParseFromString(data)
        return message
    else:
        return None
It might not be the best looking code and I'm sure it can be refactored a fair bit, but at least that should show you one way to do it.
Now the big problem: It's SLOW.
Even when using the C++ implementation of python-protobuf, it's one order of magnitude slower than in pure C++. I have a benchmark where I read 10M protobuf messages of ~30 bytes each from a file. It takes ~0.9s in C++, and 35s in python.
One way to make it a bit faster would be to re-implement the varint decoder to make it read from a file and decode in one go, instead of reading from a file and then decoding as this code currently does. (profiling shows that a significant amount of time is spent in the varint encoder/decoder). But needless to say that alone is not enough to close the gap between the python version and the C++ version.
Any idea to make it faster is very welcome :)

Just for completeness, I post here an up-to-date version that works with the master version of protobuf and Python 3.
For the C++ version it is sufficient to use the utilities in delimited_message_util.h. Here is an MWE:
#include <google/protobuf/io/zero_copy_stream_impl.h>
#include <google/protobuf/util/delimited_message_util.h>
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <deque>
#include <iostream>
#include <string>
template <typename T>
bool writeManyToFile(std::deque<T> messages, std::string filename) {
// O_CREAT requires an explicit mode argument.
int outfd = open(filename.c_str(), O_WRONLY | O_CREAT | O_TRUNC, 0644);
google::protobuf::io::FileOutputStream fout(outfd);
bool success = true; // stays true if there is nothing to write
for (const auto& msg : messages) {
success = google::protobuf::util::SerializeDelimitedToZeroCopyStream(
msg, &fout);
if (! success) {
std::cout << "Writing Failed" << std::endl;
break;
}
}
fout.Close();
close(outfd);
return success;
}
template <typename T>
std::deque<T> readManyFromFile(std::string filename) {
int infd = open(filename.c_str(), O_RDONLY);
google::protobuf::io::FileInputStream fin(infd);
bool keep = true;
bool clean_eof = true;
std::deque<T> out;
while (keep) {
T msg;
keep = google::protobuf::util::ParseDelimitedFromZeroCopyStream(
&msg, &fin, nullptr);
if (keep)
out.push_back(msg);
}
fin.Close();
close(infd);
return out;
}
For the Python 3 version, building on @fireboot's answer, the only thing that needed modification is the decoding of raw_varint32:
def getSize(raw_varint32):
    # Decode every byte of the varint, not only the first one,
    # so that sizes of 128 bytes and more are handled too.
    result = 0
    shift = 0
    for b in raw_varint32:
        result |= ((ord(b) & 0x7f) << shift)
        shift += 7
    return result
def readDelimitedFrom(MessageType, stream):
    raw_varint32 = readRawVarint32(stream)
    message = None
    if raw_varint32:
        size = getSize(raw_varint32)
        data = stream.read(size)
        if len(data) < size:
            raise Exception("Unexpected end of file")
        message = MessageType()
        message.ParseFromString(data)
    return message

Was also looking for a solution for this. Here's the core of our solution, assuming some java code wrote many MyRecord messages with writeDelimitedTo into a file. Open the file and loop, doing:
if(someCodedInputStream->ReadVarint32(&bytes)) {
CodedInputStream::Limit msgLimit = someCodedInputStream->PushLimit(bytes);
if(myRecord->ParseFromCodedStream(someCodedInputStream)) {
//do your stuff with the parsed MyRecord instance
} else {
//handle parse error
}
someCodedInputStream->PopLimit(msgLimit);
} else {
//maybe end of file
}
Hope it helps.

Working with an Objective-C version of Protocol Buffers, I ran into this exact issue. When sending from the iOS client to a Java-based server that uses parseDelimitedFrom, which expects the length as a varint prefix, I needed to write the length to the CodedOutputStream first with writeRawByte. (A single raw byte is only a valid varint prefix for messages shorter than 128 bytes.) Posting here to hopefully help others who run into this issue. While working through it, one would think that Google's protobufs would come with a simple flag that does this for you...
Request* request = [rBuild build];
[self sendMessage:request];
}
- (void) sendMessage:(Request *) request {
//** get length
NSData* n = [request data];
uint8_t len = [n length];
PBCodedOutputStream* os = [PBCodedOutputStream streamWithOutputStream:outputStream];
//** prepend it to message, such that Request.parseDelimitedFrom(in) can parse it properly
[os writeRawByte:len];
[request writeToCodedOutputStream:os];
[os flush];
}

Since I'm not allowed to write this as a comment to Kenton Varda's answer above; I believe there is a bug in the code he posted (as well as in other answers which have been provided). The following code:
...
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
uint32_t size;
if (!input.ReadVarint32(&size)) return false;
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
...
sets an incorrect limit because it does not take into account the size of the varint32 which has already been read from input. This can result in data loss/corruption as additional bytes are read from the stream which may be part of the next message. The usual way of handling this correctly is to delete the CodedInputStream used to read the size and create a new one for reading the payload:
...
uint32_t size;
{
google::protobuf::io::CodedInputStream input(rawInput);
// Read the size.
if (!input.ReadVarint32(&size)) return false;
}
google::protobuf::io::CodedInputStream input(rawInput);
// Tell the stream not to read beyond that size.
google::protobuf::io::CodedInputStream::Limit limit =
input.PushLimit(size);
...

You can use getline for reading a string from a stream, using the specified delimiter:
istream& getline ( istream& is, string& str, char delim );
(defined in the <string> header)

Reading all of a struct or nothing from a pipe/socket in Linux?

I've got a subprocess that I've started with popen and that outputs fixed-size structs containing some status information. My plan is to have a separate thread that reads from the stdout of that process to pull in the data as it comes.
I've got to check a flag periodically to make sure the program is still running so I can shut down cleanly, so I have to set the pipe to non-blocking and run a loop piecing together the status message.
Is there a canonical way I can tell Linux "either read this entire amount or nothing before a timeout"? That way I'd be able to check my flag without having to handle the boilerplate of reading the structure piecemeal.
Alternatively, is there a way to push data back into a pipe? I could try to read the whole thing, and if it times out before it's all ready, push what I have back in and try again in a bit.
I've also written my own popen (so I can grab stdin and stdout), so I'm totally OK using a socket rather than a pipe if that helps.

Here's what I ended up doing for anyone that's curious. I just wrote a class that wraps up the file descriptor and message size and gives me the "all-or-none" behavior I want.
#include <cerrno>
#include <unistd.h>
struct aonreader {
aonreader(int fd, ssize_t size) {
fd_ = fd;
size_ = size; // was "size_ = size_", which left size_ uninitialized
nread_ = 0;
nremain_ = size_;
}
ssize_t read(void *dst) {
// Call the global POSIX read(); an unqualified read() would resolve to this member.
ssize_t ngot = ::read(fd_, (char*)dst + nread_, nremain_);
if (ngot < 0) {
if (errno != EAGAIN && errno != EWOULDBLOCK) {
return -1; // error
}
} else {
nread_ += ngot;
nremain_ -= ngot;
// if we read a whole struct
if (nremain_ == 0) {
nread_ = 0;
nremain_ = size_;
return size_;
}
}
return 0;
}
private:
int fd_;
ssize_t size_;
ssize_t nread_;
ssize_t nremain_;
};
Which can then be used something like this:
thing_you_want thing;
aonreader buffer(fd, sizeof(thing_you_want));
while (running) {
ssize_t ngot = buffer.read(&thing); // signed, so the "ngot < 0" check below can fire
if (ngot == sizeof(thing_you_want)) {
<handle thing>
} else if (ngot < 0) {
<error, handle errno>
}
<otherwise loop and check running flag>
}
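As a variation on the same idea, the wait itself can be bounded with poll() so the loop does not spin between checks of the running flag. This is only a sketch under my own assumptions: the helper name pollReadStruct and its return convention are invented for illustration, and the caller keeps the `filled` counter between calls.

```cpp
#include <cerrno>
#include <cstddef>
#include <poll.h>
#include <unistd.h>

// Hypothetical helper: wait up to timeoutMs for data on fd, then read as much
// of the remaining struct as is available into dst. Returns 1 once all `size`
// bytes have arrived (and resets `filled` to 0), 0 if the struct is still
// incomplete (timeout or partial read), and -1 on error or end of file.
int pollReadStruct(int fd, void* dst, std::size_t size, std::size_t& filled,
                   int timeoutMs) {
  struct pollfd pfd;
  pfd.fd = fd;
  pfd.events = POLLIN;
  pfd.revents = 0;

  const int ready = poll(&pfd, 1, timeoutMs);
  if (ready < 0) return errno == EINTR ? 0 : -1;
  if (ready == 0) return 0;  // timed out, nothing new to read

  const ssize_t n =
      ::read(fd, static_cast<char*>(dst) + filled, size - filled);
  if (n < 0) {
    const bool retry =
        errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR;
    return retry ? 0 : -1;
  }
  if (n == 0) return -1;  // writer closed its end

  filled += static_cast<std::size_t>(n);
  if (filled < size) return 0;  // keep the partial data for the next call
  filled = 0;
  return 1;
}
```

The calling loop would then check its running flag each time the function returns 0, and handle the struct when it returns 1.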

Reading Variable length messages in Qtcp readyRead()

The following code is intended to display an Image sent over network. I sent a header of 16 bytes which I use to calculate the size of image that follows and then read that many bytes and display the image.
I used the concept at this link Tcp packets using QTcpSocket
void socket::readyRead()
{
while(socket->bytesAvailable() > 0) {
quint8 Data[16];
socket->read((char *)&Data,16);
img_size = (((quint8)Data[1]<<8)+ (quint8)Data[0]) * (((quint8)Data[3]<<8)+ (quint8)Data[2]) * 1;
QByteArray buffer = socket->read(img_size);
while(buffer.size() < (img_size))
{
// qDebug() << buffer.size();
socket->waitForReadyRead();
buffer.append(socket->read((img_size)-(buffer.size()) ));
}
unsigned char* imgdatara = (unsigned char*)&buffer.data()[0];
if( !image )
image = new QImage(imgdatara,32,640,QImage::Format_Grayscale8);
else
{
delete image;
image = new QImage(imgdatara,32,640,QImage::Format_Grayscale8);
}
emit msg(image);
}
}
My GUI says "not responding". How should I solve this?
Thanks

This is working code from Max Schlee's book "Qt 4.8 Professional programming". This is not a simple question, because when the readyRead() signal fires you may have received:
1. A complete block
2. Only part of a block
3. Several blocks together
void MyClass::onReceive()
{
QDataStream in(m_pClient);
in.setVersion(QDataStream::Qt_4_6); // Your version. Not necessary.
for(;;)
{
if(m_nextBlockSize == 0)
{
if(m_pClient->bytesAvailable() < sizeof(m_nextBlockSize))
{
break;
}
else
{
in >> m_nextBlockSize;
}
}
if(m_pClient->bytesAvailable() < m_nextBlockSize)
{
break;
}
// Here you have each complete block
processYourBlockHere(); // <=====
m_nextBlockSize = 0;
}
}
Update: useful links for you: Serializing Qt Data Types and QDataStream
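The same three cases can be reproduced without Qt. Below is a rough sketch in plain C++, assuming each block is a 2-byte big-endian length followed by that many payload bytes (the width of m_nextBlockSize is not shown above, so 16 bits is my assumption). `BlockAssembler`, `feed` and `frame` are hypothetical names for this illustration only.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Builds one block: 2-byte big-endian length prefix followed by the payload.
// Assumption: payloads are shorter than 65536 bytes.
std::string frame(const std::string& payload) {
    std::string out;
    out.push_back(static_cast<char>((payload.size() >> 8) & 0xFF));
    out.push_back(static_cast<char>(payload.size() & 0xFF));
    out += payload;
    return out;
}

// Reassembles length-prefixed blocks from arbitrary chunks of a byte stream.
class BlockAssembler {
public:
    // Appends newly received bytes and returns every block that is now complete.
    std::vector<std::string> feed(const std::string& chunk) {
        buffer_ += chunk;
        std::vector<std::string> blocks;
        for (;;) {
            if (!have_size_) {
                if (buffer_.size() < 2) break;          // length prefix not complete yet
                next_size_ = (static_cast<std::size_t>(static_cast<unsigned char>(buffer_[0])) << 8)
                           | static_cast<std::size_t>(static_cast<unsigned char>(buffer_[1]));
                buffer_.erase(0, 2);
                have_size_ = true;
            }
            if (buffer_.size() < next_size_) break;     // only part of the block has arrived
            blocks.push_back(buffer_.substr(0, next_size_));
            buffer_.erase(0, next_size_);
            have_size_ = false;                         // look for the next prefix
        }
        return blocks;
    }

private:
    std::string buffer_;
    std::size_t next_size_ = 0;
    bool have_size_ = false;
};
```

The loop only exits when the buffered data cannot complete a prefix or a block, so a single call handles a partial block, exactly one block, or several blocks delivered together. Nothing in it waits for more data, which is what keeps an event loop responsive.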

free() causing stack overflow

I am trying to develop an application with Visual Studio C++ using some communication DLLs.
In one of the DLLs, I get a stack overflow exception.
I have two functions: one receives packets, and the other performs some operations on them.
static EEcpError RxMessage(unsigned char SrcAddr, unsigned char SrcPort, unsigned char DestAddr, unsigned char DestPort, unsigned char* pMessage, unsigned long MessageLength)
{
EEcpError Error = ERROR_MAX;
TEcpChannel* Ch = NULL;
TDevlinkMessage* RxMsg = NULL;
// Check the packet is sent to an existing port
if (DestPort < UC_ECP_CHANNEL_NB)
{
Ch = &tEcpChannel[DestPort];
RxMsg = &Ch->tRxMsgFifo.tDevlinkMessage[Ch->tRxMsgFifo.ucWrIdx];
// Check the packet is not empty
if ((0UL != MessageLength)
&& (NULL != pMessage))
{
if (NULL == RxMsg->pucDataBuffer)
{
// Copy the packet
RxMsg->ucSrcAddr = SrcAddr;
RxMsg->ucSrcPort = SrcPort;
RxMsg->ucDestAddr = DestAddr;
RxMsg->ucDestPort = DestPort;
RxMsg->ulDataBufferSize = MessageLength;
RxMsg->pucDataBuffer = (unsigned char*)malloc(RxMsg->ulDataBufferSize);
if (NULL != RxMsg->pucDataBuffer)
{
memcpy(RxMsg->pucDataBuffer, pMessage, RxMsg->ulDataBufferSize);
// Prepare for next message
if ((UC_ECP_FIFO_DEPTH - 1) <= Ch->tRxMsgFifo.ucWrIdx)
{
Ch->tRxMsgFifo.ucWrIdx = 0U;
}
else
{
Ch->tRxMsgFifo.ucWrIdx += 1U;
}
// Synchronize the application
if (0 != OS_MbxPost(Ch->hEcpMbx))
{
Error = ERROR_NONE;
}
else
{
Error = ERROR_WINDOWS;
}
}
else
{
Error = ERROR_WINDOWS;
}
}
else
{
// That should never happen. In case it happens, that means the FIFO
// is full. Either the FIFO size should be increased, or the listening thread
// does no more process the messages.
// In that case, the last received message is lost (until the messages are processed, or forever...)
Error = ERROR_FIFO_FULL;
}
}
else
{
Error = ERROR_INVALID_PARAMETER;
}
}
else
{
// Trash the packet, nothing else to do
Error = ERROR_NONE;
}
return Error;
}
static EEcpError ProcessNextRxMsg(unsigned char Port, unsigned char* SrcAddr, unsigned char* SrcPort, unsigned char* DestAddr, unsigned char* Packet, unsigned long* PacketSize)
{
EEcpError Error = ERROR_MAX;
TEcpChannel* Ch = &tEcpChannel[Port];
TDevlinkMessage* RxMsg = &Ch->tRxMsgFifo.tDevlinkMessage[Ch->tRxMsgFifo.ucRdIdx];
if (NULL != RxMsg->pucDataBuffer)
{
*SrcAddr = RxMsg->ucSrcAddr;
*SrcPort = RxMsg->ucSrcPort;
*DestAddr = RxMsg->ucDestAddr;
*PacketSize = RxMsg->ulDataBufferSize;
memcpy(Packet, RxMsg->pucDataBuffer, RxMsg->ulDataBufferSize);
// Cleanup the processed message
free(RxMsg->pucDataBuffer); // <= Exception stack overflow after 40 min
RxMsg->pucDataBuffer = NULL;
RxMsg->ulDataBufferSize = 0UL;
RxMsg->ucSrcAddr = 0U;
RxMsg->ucSrcPort = 0U;
RxMsg->ucDestAddr = 0U;
RxMsg->ucDestPort = 0U;
// Prepare for next message
if ((UC_ECP_FIFO_DEPTH - 1) <= Ch->tRxMsgFifo.ucRdIdx)
{
Ch->tRxMsgFifo.ucRdIdx = 0U;
}
else
{
Ch->tRxMsgFifo.ucRdIdx += 1U;
}
Error =ERROR_NONE;
}
else
{
Error = ERROR_NULL_POINTER;
}
return Error;
}
The problem occurs after 40 minutes. During all this time I receive a lot of packets, and everything goes well.
After 40 minutes, the stack overflow exception occurs on the free() call.
I don't know what is going wrong.
Can anyone help me, please?
Thank you.

A few suggestions:
The line
memcpy(Packet, RxMsg->pucDataBuffer, RxMsg->ulDataBufferSize);
is slightly suspect, as it occurs just before the free() call that crashes. How is Packet allocated, and how do you make sure a buffer overflow cannot occur here?
If this is an asynchronous / multi-threaded program, do you have the necessary locks to prevent data from being written and read at the same time?
If you still cannot find the issue, your best bet is to run a tool like Valgrind to diagnose and narrow down memory issues more precisely. As dasblinklight mentions in the comments, the issue most likely originates somewhere else and merely happens to show up at the free() call.
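To make the first two suggestions concrete, here is a minimal sketch of a receive FIFO that checks the caller's buffer capacity before copying and guards both indices with a lock. It is not the asker's code: `RxFifo`, `push`, `pop` and `kFifoDepth` are hypothetical names, and it swaps malloc/free for std::vector and an OS-specific lock for std::mutex, purely for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <mutex>
#include <vector>

// Illustrative depth; the real value of UC_ECP_FIFO_DEPTH is not shown in the question.
constexpr std::size_t kFifoDepth = 4;

// Hypothetical stand-in for the question's per-channel receive FIFO.
class RxFifo {
public:
    // Stores a copy of the message. Returns false if the input is empty or the FIFO is full.
    bool push(const unsigned char* data, std::size_t size) {
        if (data == nullptr || size == 0) return false;
        std::lock_guard<std::mutex> lock(mutex_);   // writer and reader never overlap
        if (count_ == kFifoDepth) return false;
        slots_[write_idx_].assign(data, data + size);
        write_idx_ = (write_idx_ + 1) % kFifoDepth;
        ++count_;
        return true;
    }

    // Copies the oldest message into out only if it fits in capacity bytes.
    // Returns the message size, or 0 if the FIFO is empty or out is too small
    // (in which case the message stays queued).
    std::size_t pop(unsigned char* out, std::size_t capacity) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (count_ == 0 || out == nullptr) return 0;
        std::vector<unsigned char>& msg = slots_[read_idx_];
        if (msg.size() > capacity) return 0;        // refuse to overflow the caller's buffer
        const std::size_t n = msg.size();
        std::memcpy(out, msg.data(), n);
        msg.clear();
        read_idx_ = (read_idx_ + 1) % kFifoDepth;
        --count_;
        return n;
    }

private:
    std::mutex mutex_;
    std::array<std::vector<unsigned char>, kFifoDepth> slots_;
    std::size_t write_idx_ = 0;
    std::size_t read_idx_ = 0;
    std::size_t count_ = 0;
};
```

Passing the destination capacity into pop() is the key difference from the memcpy in the question, where nothing stops a message larger than Packet from overwriting adjacent memory and corrupting the heap long before free() notices.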
