Check whether a parsed Protocol Buffers (Protobuf) proto3 message is valid - C++

I currently need to transfer a Protobuf message of a given type Foo over the network and handle it on reception.
I am using Protobuf 3.5.1 and the proto3 language.
On reception I do the following:
Create an empty instance of the expected type Foo (via a Factory Method, since it may change).
Parse the message from a CodedInputStream (constructed from the data received from the network).
Validate whether the parsing has been successful.
Here is a minimal code snippet illustrating the concrete problem:
// Create the empty Protobuf message of the EXPECTED type.
std::shared_ptr<google::protobuf::Message> message = std::make_unique<Foo>();
// Create the CodedInputStream from the network data.
google::protobuf::io::CodedInputStream stream{receive_buffer_.data(), static_cast<int>(bytes_transferred)};
if (message->ParseFromCodedStream(&stream) && stream.ConsumedEntireMessage() && message->IsInitialized()) {
    // Message is of correct type and correctly parsed.
} else {
    // Message is NOT VALID.
    // TODO: This does not work if sending a message of type `Bar`.
}
Everything works flawlessly if the sender transfers messages of the expected type Foo, but things get weird if it sends a message of another type Bar.
Both messages have a similar structure: the only difference is the oneof, whose members use different types (but may use identical field numbers):
message Foo {
    First first = 1;
    oneof second {
        A a = 2;
        B b = 3;
    }
}
message Bar {
    First first = 1;
    oneof second {
        C c = 2;
        D d = 3;
    }
}
I haven't yet found a way to detect whether the received Protobuf message type is "valid".
So my question is: How can I safely detect whether the parsed instance of Foo is valid and not actually an instance of another (similarly structured but completely different) Protobuf message type? Do I need to manually add the name of the message sent as a field (that would be stupid, but should work)?
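For illustration, here is a minimal sketch of that "carry the type with the payload" idea using the well-known google.protobuf.Any wrapper, which records the packed message's type name automatically (not my current code, just the kind of envelope I mean; the foo.pb.h header name is assumed):
#include <string>
#include <google/protobuf/any.pb.h>
#include "foo.pb.h"  // assumed generated header for Foo

// Sender side: wrap the payload in an Any, which stores the type URL of Foo.
std::string PackFoo(const Foo& foo)
{
    google::protobuf::Any envelope;
    envelope.PackFrom(foo);
    return envelope.SerializeAsString();
}

// Receiver side: parse the envelope first, then check the payload type.
bool TryUnpackFoo(const std::string& wire, Foo* out)
{
    google::protobuf::Any received;
    if (!received.ParseFromString(wire) || !received.Is<Foo>()) {
        return false;  // Some other message type (e.g. Bar) or malformed data.
    }
    return received.UnpackTo(out);  // The payload really is a Foo.
}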

Related

How can I receive temperature messages from an Impinj reader over LLRP?

I am attempting to monitor the temperature of a reader over an LLRP connection. In out_impinj_ltkcpp.h I see a class called CImpinjReaderTemperature that looks mostly boilerplate:
class CImpinjReaderTemperature : public CParameter
{
  public:
    CImpinjReaderTemperature (void);
    ~CImpinjReaderTemperature (void);
    static const CFieldDescriptor * const
    s_apFieldDescriptorTable[];
    static const CTypeDescriptor
    s_typeDescriptor;
    //... clipped for brevity
};
There is an enumeration that looks useful:
enum EImpinjRequestedDataType {
    ImpinjRequestedDataType_All_Configuration = 2000, /**< All_Configuration */
    ImpinjRequestedDataType_Impinj_Sub_Regulatory_Region = 2001, /**< Impinj_Sub_Regulatory_Region */
    ImpinjRequestedDataType_Impinj_GPI_Debounce_Configuration = 2003, /**< Impinj_GPI_Debounce_Configuration */
    ImpinjRequestedDataType_Impinj_Reader_Temperature = 2004, /**< Impinj_Reader_Temperature */
    //...clipped for brevity
};
First, how are temperature messages received over LLRP, i.e. do reports need to be requested? Does the temperature need to be polled? Second, how do these parameters fit into LLRP? Which message is the correct one to send (CGET_READER_CONFIG, CUSTOM_MESSAGE, something else)?
The documentation on the LLRP protocol, and especially the Impinj extensions, is somewhat lacking or locked behind doors whose keys were lost years ago. That said, I was able to find a document that referenced the Impinj:Temperature message and piece things together from there.
First, the temperature response comes as a custom part of a CGET_READER_CONFIG_RESPONSE message. This means we need to send a CGET_READER_CONFIG message that requests the custom temperature extension:
CGET_READER_CONFIG *pCmd;
CMessage *pRspMsg;
CGET_READER_CONFIG_RESPONSE *pRsp;
// Compose the command message
pCmd = new CGET_READER_CONFIG();
pCmd->setRequestedData(GetReaderConfigRequestedData_Identification); // This is cheaper than the default of "all"
CImpinjRequestedData * req = new CImpinjRequestedData();
req->setRequestedData(ImpinjRequestedDataType_Impinj_Reader_Temperature);
pCmd->addCustom(req);
Attached to that config request message is a CImpinjRequestedData object that encodes the single integer 2004; this is the significance of the enumeration in my question above. After sending this message, the reader will reply with a response that includes the identification we asked for. Without asking only for the reader's identification, the requestedData value would be 0, which corresponds to "all information" and results in quite a large message.
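For completeness, here is a minimal sketch of actually sending that request and picking up the response, assuming an already-connected LTKCPP CConnection object named pConn (any name not in the snippet above is an assumption):
// pConn is an assumed, already-connected LLRP::CConnection instance.
pRspMsg = pConn->transact(pCmd);
delete pCmd;
if (pRspMsg != NULL &&
    pRspMsg->m_pType == &CGET_READER_CONFIG_RESPONSE::s_typeDescriptor)
{
    pRsp = (CGET_READER_CONFIG_RESPONSE *) pRspMsg;
    // pRsp now carries the custom temperature parameter requested above.
}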
Along with the identifier is an <Impinj:Temperature> element containing the reader's internal temperature in Celsius. That can be accessed by enumerating the custom field responses (there is only one) and, after checking its type, reading its temperature field:
std::list<CParameter *>::iterator it;
for (it = pRsp->beginCustom(); it != pRsp->endCustom(); it++)
{
    if ((*it)->m_pType == &CImpinjReaderTemperature::s_typeDescriptor)
    {
        CImpinjReaderTemperature* temp = (CImpinjReaderTemperature*) *it;
        if (NULL != temperature_out)
            *temperature_out = temp->getTemperature();
    }
}
While this may not be the most convenient interface for fetching this information, it works reliably. It can also serve as an example for fetching other LLRP extensions.

Is adding fields in a nested protobuf message backwards compatible?

I have a protobuf message roughly like this:
syntax = "proto3";
package state;
message Version
{
    uint32 major = 1;
    uint32 minor = 2;
    uint32 fix = 3;
}
message Parameters
{
    optional int32 a = 1;
    optional int32 b = 2;
    optional bool c = 3;
    optional float d = 4;
}
/** A simple tree data structure to hold some state information */
message Tree
{
    /** All state value attributes for this node, as a name/value pair map */
    map<string, bytes> attributes = 1;
    /** All child state trees as a child id / Tree pair map */
    map<string, Tree> child_trees = 2;
}
message State
{
    string product_name = 1;
    Version product_version = 2;
    Parameters parameters = 3;
    Tree state_tree_a = 4;
    Tree state_tree_b = 5;
}
This is already used in shipped software. Now we would like to add further optional fields to the Parameters message (all of them either bool, int or float). This has to be backwards compatible, e.g. a new version of the software should be able to parse a State message written with the older version, where there were fewer fields in the nested Parameters message. The messages are serialised in C++ via SerializeToArray and restored via ParseFromArray.
I did some tests and this seems to work without problems. Extending the Parameters message did not break the readability of the Tree messages after it, and fields not defined in the first version simply returned false when checked for presence.
But I won't rely on this limited amount of trial-and-error testing for shipping code, so I'd like to find some documentation on whether this should be safe by design, or whether it just worked by pure luck in my tests or only under certain circumstances. In this case, it would be completely okay for me if it worked in a C++ context only.
From my understanding, the Updating A Message Type section of the proto3 language guide does not give a clear answer on whether it is safe to extend a message that sits nested in the middle of another message without risking invalidating the data located behind it. But I might be overlooking something or getting something wrong from the documentation. Can anyone point me to a clearly documented answer to my question?

Set oneof in a protobuf message using reflection

After a few hours, I still cannot set a oneof field in a freshly created (empty) protobuf message using reflection in C++.
I can obtain the needed OneofDescriptor through the Descriptor of the message. But when I try to 'set' the oneof using Reflection, I run into the real problem. There are only three member functions related to OneofDescriptor:
HasOneof to check whether a member of a oneof has previously been set in the message
GetOneofFieldDescriptor to get the FieldDescriptor of a previously set oneof member in the message
ClearOneof (undocumented) to clear the oneof.
So there is no SetOneofFieldDescriptor, and if the oneof in the message has not previously been set (e.g. via one of the message's mutable_XXXX member functions), GetOneofFieldDescriptor returns nullptr.
Therefore I am really stuck and any idea will be welcome.
Thanks in advance.
You set it the same way you would set the field if it weren't part of a oneof. Get a FieldDescriptor from the message's Descriptor and pass it to the appropriate SetXXX method of the message's Reflection.
Given a message like the following:
message Foo
{
    oneof bar
    {
        int32 a = 1;
        string b = 2;
    }
}
You can set the a member as follows:
#include "foo.pb.h"
int main()
{
Foo f;
const google::protobuf::Descriptor* d = f.GetDescriptor();
const google::protobuf::FieldDescriptor* a = d->FindFieldByName("a");
const google::protobuf::Reflection* r = f.GetReflection();
r->SetInt32(&f, a, 42);
}
Protobuf will take care of making sure any previously set members of the oneof get unset as needed.
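If you also want to verify afterwards which member of the oneof ended up set, still via reflection, here is a short sketch against the same Foo message:
#include "foo.pb.h"

int main()
{
    Foo f;
    const google::protobuf::Descriptor* d = f.GetDescriptor();
    const google::protobuf::Reflection* r = f.GetReflection();
    const google::protobuf::OneofDescriptor* bar = d->FindOneofByName("bar");

    r->SetInt32(&f, d->FindFieldByName("a"), 42);

    // HasOneof reports whether any member of the oneof is set;
    // GetOneofFieldDescriptor tells you which member it is.
    if (r->HasOneof(f, bar))
    {
        const google::protobuf::FieldDescriptor* set_field = r->GetOneofFieldDescriptor(f, bar);
        // set_field->name() is "a" at this point.
    }
}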

Send protocol buffer data via Socket and determine the class

I'm working with Google's Protocol Buffers right now and have a question. If I have multiple .proto files and thus multiple classes, is it somehow possible, when the data is sent over a socket, to determine which type it is?
E.g. I have two classes, let's call them person.proto and adress.proto. Now I send one of those over the wire. How can the receiver determine whether it is a person or an adress?
I am doing this in C++.
My attempt would be adding a frame around the message, containing length and type. But I want to know if there is already some kind of implementation for the type handling, so I don't reimplement existing functionality.
Yes, it is possible. Protobuf supports reflection using so-called message descriptors.
But (as stated in the other answer) you'll need a reliable, well-known root message type. Instead of introducing your own message discrimination mechanism, IMHO it's better to use Protobuf's extension mechanism.
Here's a sample of what we have in production:
package Common.ConfigurationCommands;

message UcpConfiguration
{
    optional uint32 componentIndex = 1;
    optional ConfigCmdStatus configCmdResponseStatus = 2;
    optional string configErrorDescription = 3;
    extensions 100 to max;
}
The extensions look like this:
import "Common/ConfigurationCommands.proto";
message AmplifierConfiguration
{
extend Common.ConfigurationCommands.UcpConfiguration
{
optional AmplifierConfiguration amplifierConfiguration = 108;
}
optional uint32 preemphasis = 1;
}
import "Common/ConfigurationCommands.proto";
message FrontendConfiguration
{
extend Common.ConfigurationCommands.UcpConfiguration
{
optional FrontendConfiguration frontendConfiguration = 100;
}
optional bool frontendActive = 1;
optional uint32 refInputComponentIndex = 2;
extensions 100 to max;
}
You can check this part of the documentation to see how to deal with extensions in your C++ code.
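On the C++ receiving side, the dispatch then boils down to parsing the root message and probing its extensions. A minimal sketch (the generated header names and the wire_data buffer are assumptions):
#include <string>
#include "Common/ConfigurationCommands.pb.h"  // assumed generated header names
#include "AmplifierConfiguration.pb.h"
#include "FrontendConfiguration.pb.h"

void HandleConfiguration(const std::string& wire_data)
{
    Common::ConfigurationCommands::UcpConfiguration ucp;
    if (!ucp.ParseFromString(wire_data))
        return;  // not a valid UcpConfiguration at all

    if (ucp.HasExtension(FrontendConfiguration::frontendConfiguration))
    {
        const FrontendConfiguration& frontend = ucp.GetExtension(FrontendConfiguration::frontendConfiguration);
        // ... handle the frontend configuration ...
    }
    else if (ucp.HasExtension(AmplifierConfiguration::amplifierConfiguration))
    {
        const AmplifierConfiguration& amplifier = ucp.GetExtension(AmplifierConfiguration::amplifierConfiguration);
        // ... handle the amplifier configuration ...
    }
}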
It is impossible to detect which object was serialized; Protobuf doesn't do that for you. But you can handle it with Protobuf very easily:
Method 1: just send a message that has a type field and a body. Into the body you serialize your objects, and in the type field you indicate which object is serialized.
Something like this:
package MyGreatPackage;

message Pack
{
    required bytes packcode = 1;
    // code for data/query
    required bytes mess = 2;
}

message Data
{
    // anything you need to
}

message Query
{
    // anything you need to
}
So you will always send a Pack message, which states exactly which object is contained in the mess field.
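In C++, the sender and receiver logic for that wrapper could look roughly like this (a sketch; the "DATA"/"QUERY" tag values are arbitrary choices of mine):
// Sender: serialize the real payload into the wrapper's mess field.
MyGreatPackage::Data data;
// ... fill in data ...
MyGreatPackage::Pack pack;
pack.set_packcode("DATA");                // arbitrary tag identifying the payload type
pack.set_mess(data.SerializeAsString());
std::string wire = pack.SerializeAsString();

// Receiver: parse the wrapper first, then dispatch on packcode.
MyGreatPackage::Pack received;
if (received.ParseFromString(wire))
{
    if (received.packcode() == "DATA")
    {
        MyGreatPackage::Data payload;
        payload.ParseFromString(received.mess());
    }
    else if (received.packcode() == "QUERY")
    {
        MyGreatPackage::Query payload;
        payload.ParseFromString(received.mess());
    }
}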
Method 2: Protobuf describes a technique for achieving the same thing without the Pack wrapper; see https://developers.google.com/protocol-buffers/docs/techniques?hl=ru#union
message OneMessage {
    enum Type { FOO = 1; BAR = 2; BAZ = 3; }

    // Identifies which field is filled in.
    required Type type = 1;

    // One of the following will be filled in.
    optional Foo foo = 2;
    optional Bar bar = 3;
    optional Baz baz = 4;
}
So you can declare all of the message types you may send as optional fields and determine which one is set from the required type parameter.
Still, the first variant seems better to me; choose whichever you like.
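A receiver for the second variant dispatches on the type enum instead; again just a sketch against the OneMessage definition above:
// 'wire' is an assumed std::string holding the received bytes.
OneMessage msg;
if (msg.ParseFromString(wire))
{
    switch (msg.type())
    {
    case OneMessage::FOO:
        // msg.foo() holds the payload.
        break;
    case OneMessage::BAR:
        // msg.bar() holds the payload.
        break;
    case OneMessage::BAZ:
        // msg.baz() holds the payload.
        break;
    }
}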

Serialize in C# (protobuf-net) , Deserialize in C++ (protobuf) : More than 5 fields in class

I'm having trouble deserializing an object in C++ that I had serialized in C# and then sent over the network with ZMQ. I'm fairly certain the ZMQ part is working correctly because the C++ server application (Linux) successfully receives the serialized messages from C# (Windows) and sends them back to Windows where it can successfully deserialize the message, so I don't think I'm experiencing any sort of truncated or dropped packets in that regard.
However, when I receive the message on the Linux server, the C++ deserialize method does not deserialize correctly: it dumps a bunch of binary data into the 6th field (I can see this in MyObject.DebugString()), but no data into any other fields. The strange part here, however, is that a class I had with 5 fields works perfectly fine. C++ deserializes it correctly and all of the data is working properly. Below are a few tidbits of my code. Any help would be greatly appreciated.
C#:
MemoryStream stream = new MemoryStream();
ProtoBuf.Serializer.Serialize<TestType>(stream, (TestType)data);
_publisher.Send(stream.ToArray());
C++:
message_t data;
int64_t recv_more;
size_t recv_more_sz = sizeof(recv_more);
TestType t;
bool isProcessing = true;
while (isProcessing)
{
    pSubscriber->recv(&data, 0);
    t.ParseFromArray((void*)(data.data()), sizeof(t));
    cout << "Debug: " << t.DebugString() << endl;
    pSubscriber->getsockopt(ZMQ_RCVMORE, &recv_more, &recv_more_sz);
    isProcessing = recv_more;
}
The output looks like this:
Debug: f: "4\000\000\000\000\000\"
I'm having trouble copying and pasting, but the output continues like that for probably 3 or 4 more lines.
This is my TestType class (proto file):
package Base_Types;

enum Enumr {
    Dog = 0;
    Cat = 1;
    Fish = 2;
}

message TestType {
    required double a = 1;
    required Enumr b = 2;
    required string c = 3;
    required string d = 4;
    required double e = 5;
    required bytes f = 6;
    required string g = 7;
    required string h = 8;
    required string i = 9;
    required string j = 10;
}
Field "f" is listed as bytes because when it was a string before it was giving me a warning about UTF-8 encoding, however, when this class worked with only 5 fields (the enum was one of them), it did not give me that error. It's almost like instead of deserializing, it's throwing the binary for the entire class into field "f" (field 6).
Solution: There ended up being an issue where the memory wasn't being copied before it was sent to a thread socket. When the publisher sent it back out, it was packaging the data and changing what the router received. There needs to be a memcpy() on the C++ side in order to send out the data to be used internally. Thanks for all of the help.
I've parsed it through the reader in v2, and it seems to make perfect sense:
1=5
2=0
3=
4=yo
5=6
6=2 bytes, 68-69
7=how
8=are
9=you
10=sir
Note that I've done that purely from the hex data (not using the .proto), but it should be close to your original data. But most notably, it seems intact.
So, first thing to do: check that the binary you get at the C++ side is exactly the same as the binary you sent; this is doubly important if you are doing any translations along the way (binary => string, for example, which should be done via base-64). A small hex-dump helper for that comparison is sketched below.
Second thing: if that doesn't work, it is possible that there is a problem in the C++ implementation. That seems unlikely, since it is one of Google's pets, but nothing is impossible. If the binary comes across intact but it still behaves oddly, I can try speaking to the C++ folks to see if one of us has gone cuckoo.
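For the first check, a tiny hex-dump helper on the C++ side makes the byte-for-byte comparison easy (a sketch; compare its output against something like BitConverter.ToString(stream.ToArray()) on the C# side):
#include <cstddef>
#include <cstdio>

// Print the received buffer as hex so it can be compared, byte for byte,
// with what the C# side serialized.
void DumpHex(const void* buf, std::size_t len)
{
    const unsigned char* p = static_cast<const unsigned char*>(buf);
    for (std::size_t i = 0; i < len; ++i)
        std::printf("%02X%s", p[i], (i + 1 < len) ? "-" : "\n");
}

// Usage after recv: DumpHex(data.data(), data.size());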