Protobuf C++ required fields are not checked/enforced - c++

I am new to Protocol Buffers and currently I see the following problem:
We use the proto2 syntax with the Protocol Buffers 3.2.0 library and observe that required fields are not enforced during serialization.
Here is a basic example with Catch C++ tests:
syntax = "proto2";
package simple_test;
message SimpleMessage
{
required string field1 = 1;
required string field2 = 2;
}
#include "simple.pb.h"
#include <catch.hpp>
#include <string>
SCENARIO("Verify that serialization with missing required fields fails", "[protobuf]")
{
GIVEN("a message")
{
simple_test::SimpleMessage m;
WHEN("field1 is set and field2 is not and both are required")
{
m.set_field1("aaa");
THEN("serializing this message to string buffer must fail")
{
std::string buffer;
CHECK_FALSE(m.SerializeToString(&buffer));
REQUIRE(buffer.empty());
}
}
}
}
However, m.SerializeToString(&buffer) returns true and buffer.empty() is false, so both assertions fail.
What I know
required fields were removed in the proto3 syntax.
My Questions
Is there a setting or any kind of config switch with which I can enforce these checks in Protobuf v3?
What happens with this use case:
a proto2 message compiled with the Protobuf v3 compiler ends up with partially filled required fields and is sent to a Protobuf v2 endpoint which enforces the required fields. Does this effectively mean that bytes were sent for nothing, because the message is invalid and will be rejected?
Should I downgrade from v3.2.0 to 2.x to disallow sending of incomplete messages on the client side?

I came across this documentation, which I had overlooked before: Standard Message Methods:
Each message class also contains a number of other methods that let you check or manipulate the entire message, including:
bool IsInitialized() const;: checks if all the required fields have been set.
This is what I needed! Protobuf is great again!
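A minimal sketch of how the test above could guard serialization with IsInitialized(); it only combines the methods already shown here, nothing new is assumed:

std::string buffer;
CHECK_FALSE(m.IsInitialized());       // field2 is still missing, so this holds
if (m.IsInitialized())                // only serialize complete messages
    m.SerializeToString(&buffer);
REQUIRE(buffer.empty());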

Is there a setting or any kind of config switch with which I can enforce these checks in Protobuf v3?
Have you tried building protobuf in debug config? I think you'll find that it'll assert in this case.
What happens with this use case:
a proto2 message compiled with the Protobuf v3 compiler ends up with partially filled required fields and is sent to a Protobuf v2 endpoint which enforces the required fields. Does this effectively mean that bytes were sent for nothing, because the message is invalid and will be rejected?
In this case, the receiver can use ParsePartialFromXXX to recover what was sent, and can then test message validity with message.has_xxx().
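A minimal sketch of the receiving side under that approach; wire_bytes stands in for whatever bytes were actually received and is purely illustrative:

simple_test::SimpleMessage received;
// ParsePartialFromString accepts messages with missing required fields.
if (received.ParsePartialFromString(wire_bytes)) {
    if (received.has_field1() && received.has_field2()) {
        // the message is complete, process it
    } else {
        // handle the partially filled message (reject, log, use defaults, ...)
    }
}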
Should I downgrade from v3.2.0 to 2.x to disallow sending of incomplete messages on the client side?
I wouldn't. Perhaps write a checker for each message type to assert that each required field is present. Better yet, think about it the proto3 way and treat a missing field as a default value in the receiver. The protobuf documentation advises never adding required fields to an existing message type and suggests 'thinking very carefully' before making fields required. You could take this to mean 'oops, the concept of required fields was an error'.
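For the per-message checker suggested above, a minimal sketch for the SimpleMessage from the question (the function name is illustrative; in effect it mirrors IsInitialized(), just spelled out per field so application-level checks could be added):

// proto2-generated code provides has_xxx() accessors for required fields
bool isCompleteSimpleMessage(const simple_test::SimpleMessage& m)
{
    return m.has_field1() && m.has_field2();
}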

Related

Apollo iOS how to handle partial decoding failure

I'm trying to see if there is a way to do more robust handling of partial decoding failures of Apollo generated Swift classes. Currently, if even one field of one object in an array fails to parse from the network response, the entire collection of objects fails to parse and our iOS client gets no data.
Our graphql is defined something like:
query mobile_getCollections() {
  getCollections() {
    // ... other fields
    items {
      activeRange {
        expires // Int!
        starts  // Int!
      }
    }
  }
}
So the Apollo generated Swift code is expecting non-nil Ints when decoding these values. However, due to a backend error (that we would like to make the mobile clients more resilient to), the API will occasionally send us a malformed date String instead of a unix timestamp Int. This causes parsing of the entire mobile_getCollections result to fail, because the Apollo generated query class typing can't be perfectly satisfied.
Ideally, I'd like to just throw out the one item in the collection that failed to be parsed correctly and leave the remaining items intact. Is it possible to do something like that using Apollo?
(Yes, I know the real answer here is to fix the backend, but is there anything I can do in the meantime to more gracefully handle similar partial parsing failure issues?)

Checking for duplicates in VEINS

I am new to Veins and trying to implement a mechanism to detect whether a WSM packet was received before. I am using the "psid" as the main variable to identify the packet - is that correct?
Will this type of code work?
bool MyVeinsApp::msgReceivedBefore(int psid) {
    /*
     * This function will be used to determine if the message was received
     * before and should be discarded or processed further.
     */
    if (msg_log.find(psid) == msg_log.end()) {
        return false;
    }
    else {
        return true;
    }
}
Here msg_log is a C++ data structure storing WSMs keyed by psid.
The psid is only an identifier for the service you are using (see WaveShortMessage.msg) and is therefore not unique among messages of the same service. To distinguish between messages you need a unique message identifier.
A simple approach would be to use the id which every message in OMNeT++ gets:
msg->getId()
UPDATE: Please note that this id is also unique among messages with the same content (see the comment below).
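Following that suggestion, a minimal sketch of the duplicate check keyed on such a unique identifier; msg_log as a std::set<long> and getId() as the key are assumptions carried over from the question and the answer above:

#include <set>

// assumed member of MyVeinsApp:
// std::set<long> msg_log;  // identifiers of messages seen so far

bool MyVeinsApp::msgReceivedBefore(long uniqueId)
{
    // std::set::insert returns a pair; .second is false when the id was
    // already present, i.e. the message is a duplicate
    return !msg_log.insert(uniqueId).second;
}

// usage on reception, for example:
//   if (msgReceivedBefore(wsm->getId())) return;  // drop duplicate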

c++ quickfix failure to send

I'm having an unexpected issue with a C++ QuickFIX client application using FIX 4.4. I create a MarketDataRequest, populate it, and then call send, which returns true. The message is not found in the message or event log files.
No error seems to be reported - what could be happening?
FIX44::MarketDataRequest request(FIX::MDReqID(tmp),
                                 FIX::SubscriptionRequestType('1'),
                                 FIX::MarketDepth(depth)); // 0 is full depth

FIX::SubscriptionRequestType subType(FIX::SubscriptionRequestType_SNAPSHOT);

FIX44::MarketDataRequest::NoRelatedSym symbolGroup;
symbolGroup.set(FIX::Symbol(I.subID));
request.addGroup(symbolGroup);

FIX::Header &header = request.getHeader();
header.setField(FIX::SenderCompID(sessionSenderID));
header.setField(FIX::TargetCompID(sessionTargetID));

if (FIX::Session::sendToTarget(request) == false)
    return false;
My FixConfig looks like:
[DEFAULT]
HeartBtInt=30
ResetOnLogout=Y
ResetOnLogon=Y
ResetOnDisconnect=Y
ConnectionType=initiator
UseDataDictionary=Y
FileLogPath=logs
[SESSION]
FileLogPath=logs
BeginString=FIX.4.4
DataDictionary=XXXXX
ConnectionType=initiator
ReconnectInterval=60
TargetCompID=tCompID
SenderCompID=sCompID
SocketConnectPort=123456
SocketConnectHost=XX.XX.XXX.XX
SocketConnectProtocol=TCP
StartTime=01:05:00
EndTime=23:05:30
FileLogPath=logs
FileStorePath=logs
SocketUseSSL=N
thanks for any help,
Mark
Mark, just a couple of notes, not really related to your question, but which you may find useful:
You don't have to explicitly set TargetCompID/SenderCompID for each message; the engine will do it for you.
Do not place logic into callbacks (like you did with the market data subscription in onLogon). Better to create an additional thread which consumes events from your listener, makes decisions and takes action.
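A minimal sketch of that producer/consumer pattern, assuming an Event type and queue of your own design; nothing below is QuickFIX API apart from the sendToTarget call mentioned in the final comment:

#include <condition_variable>
#include <mutex>
#include <queue>
#include <thread>

struct Event { /* e.g. session ID, a copy of the message, event kind */ };

class EventQueue {
public:
    void push(Event e) {
        {
            std::lock_guard<std::mutex> lock(m_);
            q_.push(std::move(e));
        }
        cv_.notify_one();
    }
    Event pop() {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty(); });
        Event e = std::move(q_.front());
        q_.pop();
        return e;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<Event> q_;
};

// The FIX callbacks (onLogon, fromApp, ...) only enqueue: queue.push(Event{...});
// A separate worker thread consumes events, decides, and acts, e.g. builds a
// MarketDataRequest and calls FIX::Session::sendToTarget(request).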

invalid characters in JSON message after using await/job in play framework 1.2.5

I am using Play framework 1.2.5 jobs - after await, I send a message to the web UI in JSON format. The same JSON logic works fine when not using jobs - however, after using jobs and await, the JSON message appears to contain invalid characters (client-side JavaScript no longer recognizes it as valid JSON). The browser does not render the garbled/invalid characters - I will try using Wireshark and see if I can add more details. Any ideas on what could be causing this and how best to prevent it? Thanks in advance (I'm reasonably sure it's my code causing the problem, since I can't be the first one doing this). I will also try testing with executors/futures instead of Play jobs and see how that goes.
Promise<String> testChk = new TestJobs(testInfo, "validateTest").now(); //TestJobs extends Job<String> & I'm overriding doJobWithResult. Also, constructor for TestJobs takes two fields (type for testInfo & String)
String testChkResp = await(testChk);
renderJSON(new TestMessage("fail", "failure message")); //TestMessage class has two String fields and is serializable
Update: I am using gson & JDK1.6
Update: It seems that there is a problem with encoding whenever I use Play jobs and renderJSON.
TestMessage: (works when not using jobs)
import java.io.Serializable;

public class TestMessage {
    public String status;
    public String response;

    public TestMessage() {
    }

    public TestMessage(String status, String response) {
        this.status = status;
        this.response = response;
    }
}
Update:
Even the following results in the same UTF-8 encoding problem when relying on jobs:
renderJSON("test");
Sounds like it could be a bug. It may be related to your template - does it specify the encoding explicitly?
What format is the response? You can determine this by using the inspector in chrome or Web Console in Firefox.
(Though I certainly agree the behaviour should be consistent - it may be worth filing a bug here: http://play.lighthouseapp.com/projects/57987-play-framework/tickets )
Here is a workaround: first reset the output stream, then render.
response.reset();
response.contentType = "application/json; charset=utf-8";
renderJSON("play has some bugs");
I was able to use futures & callables with executors and the same code as mentioned above works (using play 1.2.5). The only difference was that I was not explicitly using play jobs (and hence the issue does not appear to be related to gson).

How does Sentry aggregate errors?

I am using Sentry (in a django project), and I'd like to know how I can get the errors to aggregate properly. I am logging certain user actions as errors, so there is no underlying system exception, and am using the culprit attribute to set a friendly error name. The message is templated, and contains a common message ("User 'x' was unable to perform action because 'y'"), but is never exactly the same (different users, different conditions).
Sentry clearly uses some set of attributes under the hood to determine whether to aggregate errors as the same exception, but despite having looked through the code, I can't work out how.
Can anyone short-cut my having to dig further into the code and tell me what properties I need to set in order to manage aggregation as I would like?
[UPDATE 1: event grouping]
This line appears in sentry.models.Group:
class Group(MessageBase):
    """
    Aggregated message which summarizes a set of Events.
    """
    ...
    class Meta:
        unique_together = (('project', 'logger', 'culprit', 'checksum'),)
    ...
Which makes sense - project, logger and culprit I am setting at the moment - the problem is checksum. I will investigate further; however, 'checksum' suggests binary equivalence, which is never going to work - it must be possible to group instances of the same exception with different attributes?
[UPDATE 2: event checksums]
The event checksum comes from the sentry.manager.get_checksum_from_event method:
def get_checksum_from_event(event):
    for interface in event.interfaces.itervalues():
        result = interface.get_hash()
        if result:
            hash = hashlib.md5()
            for r in result:
                hash.update(to_string(r))
            return hash.hexdigest()
    return hashlib.md5(to_string(event.message)).hexdigest()
Next stop - where do the event interfaces come from?
[UPDATE 3: event interfaces]
I have worked out that interfaces refer to the standard mechanism for describing data passed into sentry events, and that I am using the standard sentry.interfaces.Message and sentry.interfaces.User interfaces.
Both of these will contain different data depending on the exception instance - and so a checksum will never match. Is there any way that I can exclude these from the checksum calculation? (Or at least the User interface value, as that has to be different - the Message interface value I could standardise.)
[UPDATE 4: solution]
Here are the two get_hash functions for the Message and User interfaces respectively:
# sentry.interfaces.Message
def get_hash(self):
    return [self.message]

# sentry.interfaces.User
def get_hash(self):
    return []
Looking at these two, only the Message.get_hash interface will return a value that is picked up by the get_checksum_from_event method, and so this is the one that will be returned (hashed, etc.). The net effect of this is that the checksum is evaluated on the message alone - which in theory means that I can standardise the message and keep the user definition unique.
I've answered my own question here, but hopefully my investigation is of use to others having the same problem. (As an aside, I've also submitted a pull request against the Sentry documentation as part of this ;-))
(Note to anyone using / extending Sentry with custom interfaces - if you want to avoid your interface being used to group exceptions, return an empty list.)
See my final update in the question itself. Events are aggregated on a combination of 'project', 'logger', 'culprit' and 'checksum' properties. The first three of these are relatively easy to control - the fourth, 'checksum' is a function of the type of data sent as part of the event.
Sentry uses the concept of 'interfaces' to control the structure of data passed in, and each interface comes with an implementation of get_hash, which is used to return a hash value for the data passed in. Sentry comes with a number of standard interfaces ('Message', 'User', 'HTTP', 'Stacktrace', 'Query', 'Exception'), and each of these has its own implementation of get_hash. The default (inherited from the Interface base class) is an empty list, which would not affect the checksum.
In the absence of any valid interfaces, the event message itself is hashed and returned as the checksum, meaning that the message would need to be unique for the event to be grouped.
I've had a similar problem with exceptions. Our system currently captures only exceptions, and I was confused as to why some of these were merged into a single error while others were not.
With your information above I extracted the get_hash methods and tried to find the differences "raising" my errors. What I found out is that the grouped errors all came from a self-written exception type that has an empty Exception.message value.
get_hash output:
[<class 'StorageException'>, StorageException()]
and the multiple errors came from an exception class that has a filled message value (jinja template engine)
[<class 'jinja2.exceptions.UndefinedError'>, UndefinedError('dict object has no attribute LISTza_*XYZ*',)]
Different exception messages trigger different reports; in my case the merging was caused by the missing Exception.message value.
Implementation:
class StorageException(Exception):
    def __init__(self, value):
        Exception.__init__(self)
        self.value = value