c++ protobuf: how to iterate through fields of message? - c++

I'm new to protobuf and I'm stuck on a simple task: I need to iterate through the fields of a message and check each field's type. If the type is a message, I will do the same recursively for that message.
For example, I have such messages:
package MyTool;
message Configuration {
required GloablSettings globalSettings = 1;
optional string option1 = 2;
optional int32 option2 = 3;
optional bool option3 = 4;
}
message GloablSettings {
required bool option1 = 1;
required bool option2 = 2;
required bool option3 = 3;
}
Now, to explicitly access a field value in C++ I can do this:
MyTool::Configuration config;
fstream input("config", ios::in | ios::binary);
config.ParseFromIstream(&input);
bool option1val = config.globalSettings().option1();
bool option2val = config.globalSettings().option2();
and so on. This approach is inconvenient when a message has a large number of fields.
Can I do this with iteration and get each field's name and type? I know there are type descriptors and something called reflection, but my attempts have not succeeded.
Can someone give me an example of code, if it's possible?
Thanks!

This is old but maybe someone will benefit. Here is a method that prints the contents of a protobuf message:
void Example::printMessageContents(std::shared_ptr<google::protobuf::Message> m)
{
const Descriptor *desc = m->GetDescriptor();
const Reflection *refl = m->GetReflection();
int fieldCount= desc->field_count();
fprintf(stderr, "The fullname of the message is %s \n", desc->full_name().c_str());
for(int i=0;i<fieldCount;i++)
{
const FieldDescriptor *field = desc->field(i);
fprintf(stderr, "The name of the %i th element is %s and the type is %s \n",i,field->name().c_str(),field->type_name());
}
}
The possible values returned by field->type() are listed in the FieldDescriptor::Type enum. For example, to detect a message field, check whether the type equals FieldDescriptor::TYPE_MESSAGE.
This function prints all the "metadata" of the protobuf message. To read the values, however, you need to check each field's type and then call the corresponding getter on the Reflection object.
Using this condition, we can extract non-repeated strings:
if(field->type() == FieldDescriptor::TYPE_STRING && !field->is_repeated())
{
std::string g= refl->GetString(*m, field);
fprintf(stderr, "The value is %s ",g.c_str());
}
However, fields can be either repeated or non-repeated, and Reflection provides different methods for each, so the check above ensures that we use the right one. For repeated string fields, for example, there is this method:
GetRepeatedString(const Message & message, const FieldDescriptor * field, int index)
It takes the index of the element within the repeated field.
For a field of type message, the function above only prints the name of the field, so we should print its contents too:
if(field->type()==FieldDescriptor::TYPE_MESSAGE)
{
if(!field->is_repeated())
{
const Message &mfield = refl->GetMessage(*m, field);
// Copy the sub-message into a shared_ptr so it matches the function signature.
// The shared_ptr owns the copy and frees it when it goes out of scope.
std::shared_ptr<Message> mcopy(mfield.New());
mcopy->CopyFrom(mfield);
printMessageContents(mcopy);
}
}
Finally, if the field is repeated, call the FieldSize method on the reflection object and iterate over all elements of the repeated field.

Take a look at how the Protobuf library implements the TextFormat::Printer class, which uses descriptors and reflection to iterate over fields and convert them to text:
https://github.com/google/protobuf/blob/master/src/google/protobuf/text_format.cc#L1473

Related

Memory issue with using C++ Protobuf in a DLL

This question was posted on Google Group's Protobuf group but unfortunately it has gone unanswered.
A DLL has been created that will accept any Protobuf message which through the use of reflection will be filled with data from a database and then returned to the client. The database table name is mapped to the message name and the corresponding table columns are mapped to the message member names.
The issue appears to be with the 'SetString' method of the reflection class. After the call to the DLL has been made and the protobuf message with all the relevant data has been received, a memory error is encountered as the program exits.
A typical message will have a .proto file such as the following:
message OperatingSystemInfo {
string manufacturer;
string name;
string release;
string osversion;
string localename;
string bootdevice;
string serialnumber;
}
The code snippets for the DLL functions are as follows:
void EXPORT_DLL ReadMessage(void* protoBufMessage) {
Message* msg = static_cast<Message*>(protoBufMessage);
const Descriptor* descriptor = msg->GetDescriptor();
const Reflection* reflection = msg->GetReflection();
for (int i = 0; i < descriptor->field_count(); i++) {
const FieldDescriptor* field = descriptor->field(i);
const FieldDescriptor::Type type = field->type();
bool isRepeated = field->is_repeated();
std::string name = field->name();
if (type == FieldDescriptor::TYPE_MESSAGE) {
}
else {
//read values from database for each corresponding field and set value
string fieldValue;
SetFieldValues(reflection, field, msg, fieldValue);
}
}
}
void SetFieldValues(const google::protobuf::Reflection* reflection,
const google::protobuf::FieldDescriptor* field,
google::protobuf::Message* msg,
std::string fieldValue) {
switch (field->type()) {
case FieldDescriptor::TYPE_STRING:
reflection->SetString(msg, field, std::move(fieldValue));
break;
case FieldDescriptor::TYPE_UINT32:
break;
case FieldDescriptor::TYPE_UINT64:
break;
case FieldDescriptor::TYPE_BOOL:
break;
case FieldDescriptor::TYPE_ENUM:
break;
default:
break;
}
}
The client call to the DLL is:
void CallDLL()
{
OperatingSystemInfo* osi = new OperatingSystemInfo();
ReadMessage(osi);
delete osi;
}
The problem is that when the application is about to terminate, the 'Destroy' method for the last string in the message, 'serialnumber', causes a memory error (__acrt_first_block == header in debug_heap.cpp):
inline void OperatingSystemState::SharedDtor() {
GOOGLE_DCHECK(GetArenaForAllocation() == nullptr);
_impl_.manufacturer_.Destroy();
_impl_.name_.Destroy();
_impl_.release_.Destroy();
_impl_.osversion_.Destroy();
_impl_.localename_.Destroy();
_impl_.bootdevice_.Destroy();
_impl_.serialnumber_.Destroy();
}
void ArenaStringPtr::Destroy() {
//Line 233 in arenastring.cc
//leads to assertion Expression: __acrt_first_block == header in debug_heap.cpp
delete tagged_ptr_.GetIfAllocated();
}
Naturally, this is an attempt to free already-freed memory. Allocating the memory outside of the DLL is likely a factor, but the nature of the project is such that the DLL has no knowledge of Protobuf message types, which is why reflection is used.
Any help or hints in resolving this issue would be greatly appreciated.

How to use reflection of Protobuf to modify a Map

I'm working with Protobuf3 in my C++14 project. Some functions return google::protobuf::Message* pointers as RPC requests, and I need to set their fields, so I need to use Protobuf3 reflection.
Here is a proto file:
syntax="proto3";
package srv.user;
option cc_generic_services = true;
message BatchGetUserInfosRequest {
uint64 my_uid = 1;
repeated uint64 peer_uids = 2;
map<string, string> infos = 3;
}
message BatchGetUserInfosResponse {
uint64 my_uid = 1;
string info = 2;
}
service UserSrv {
rpc BatchGetUserInfos(BatchGetUserInfosRequest) returns (BatchGetUserInfosResponse);
};
Now I call a function that returns a google::protobuf::Message* pointing to a BatchGetUserInfosRequest object, and I try to set its fields.
// msg is a Message*, pointing to an object of BatchGetUserInfosRequest
auto descriptor = msg->GetDescriptor();
auto reflection = msg->GetReflection();
auto field = descriptor->FindFieldByName("my_uid");
reflection->SetUInt64(msg, field, 1234);
auto field2 = descriptor->FindFieldByName("peer_uids");
reflection->GetMutableRepeatedFieldRef<uint64_t>(msg, field2).CopyFrom(peerUids); // peerUids is a std::vector<uint64_t>
As you see, I can set my_uid and peer_uids as above, but for the field infos, which is a google::protobuf::Map, I don't know how to set it with the reflection mechanism.
If you dig into the source code, you will find that a map in proto3 is implemented on top of a repeated field:
// Whether the message is an automatically generated map entry type for the
// maps field.
//
// For maps fields:
// map<KeyType, ValueType> map_field = 1;
// The parsed descriptor looks like:
// message MapFieldEntry {
// option map_entry = true;
// optional KeyType key = 1;
// optional ValueType value = 2;
// }
// repeated MapFieldEntry map_field = 1;
//
// Implementations may choose not to generate the map_entry=true message, but
// use a native map in the target language to hold the keys and values.
// The reflection APIs in such implementations still need to work as
// if the field is a repeated message field.
//
// NOTE: Do not set the option in .proto files. Always use the maps syntax
// instead. The option should only be implicitly set by the proto compiler
// parser.
optional bool map_entry = 7;
Inspired by the test code from protobuf, this works for me:
BatchGetUserInfosRequest message;
auto *descriptor = message.GetDescriptor();
auto *reflection = message.GetReflection();
const google::protobuf::FieldDescriptor *fd_map_string_string =
descriptor->FindFieldByName("infos");
const google::protobuf::FieldDescriptor *fd_map_string_string_key =
fd_map_string_string->message_type()->map_key();
const google::protobuf::FieldDescriptor *fd_map_string_string_value =
fd_map_string_string->message_type()->map_value();
const google::protobuf::MutableRepeatedFieldRef<google::protobuf::Message>
mmf_string_string =
reflection->GetMutableRepeatedFieldRef<google::protobuf::Message>(
&message, fd_map_string_string);
std::unique_ptr<google::protobuf::Message> entry_string_string(
google::protobuf::MessageFactory::generated_factory()
->GetPrototype(fd_map_string_string->message_type())
->New(message.GetArena()));
entry_string_string->GetReflection()->SetString(
entry_string_string.get(), fd_map_string_string->message_type()->field(0),
"1234");
entry_string_string->GetReflection()->SetString(
entry_string_string.get(), fd_map_string_string->message_type()->field(1),
std::to_string(10));
mmf_string_string.Add(*entry_string_string);
std::cout << "1234: " << message.infos().at("1234") << '\n';
The output:
1234: 10

Import CSV into Vertica using Rfc4180CsvParser and exclude header row

Is there a way to exclude the header row when importing data via the Rfc4180CsvParser? The COPY command has a SKIP option but the option doesn't seem to work when using the CSV parsers provided in the Vertica SDK.
Background
As background, the COPY command does not read CSV files by itself. For simple CSV files, one can say COPY schema.table FROM '/data/myfile.csv' DELIMITER ',' ENCLOSED BY '"'; but this will fail with data files which have string values with embedded quotes.
Adding ESCAPE AS '"' will generate an error ERROR 3169: ENCLOSED BY and ESCAPE AS can not be the same value . This is a problem as CSV values are enclosed and escaped by ".
Vertica SDK CsvParser extensions to the rescue
Vertica provides an SDK under /opt/vertica/sdk/examples with C++ programs that can be compiled into extensions. One of these is /opt/vertica/sdk/examples/ParserFunctions/Rfc4180CsvParser.cpp.
This works great as follows:
cd /opt/vertica/sdk/examples
make clean
vsql
==> CREATE LIBRARY Rfc4180CsvParserLib AS '/opt/vertica/sdk/examples/build/Rfc4180CsvParser.so';
==> COPY myschema.mytable FROM '/data/myfile.csv' WITH PARSER Rfc4180CsvParser();
Problem
The above works great, except that it imports the header row of the data file as a data row. The COPY command has a SKIP 1 option, but it does not work with the parser.
Question
Is it possible to edit Rfc4180CsvParser.cpp to skip the first row or, better yet, take a parameter that specifies the number of rows to skip?
The program is just 135 lines but I don't see where/how to make this incision. Hints?
Copying the entire program below as I don't see a public repo to link to...
Rfc4180CsvParser.cpp
/* Copyright (c) 2005 - 2012 Vertica, an HP company -*- C++ -*- */
#include "Vertica.h"
#include "StringParsers.h"
#include "csv.h"
using namespace Vertica;
// Note, the class template is mostly for demonstration purposes,
// so that the same class can use each of two string-parsers.
// Custom parsers can also just pick a string-parser to use.
/**
* A parser that parses something approximating the "official" CSV format
* as defined in IETF RFC-4180: <http://tools.ietf.org/html/rfc4180>
* Oddly enough, many "CSV" files don't actually conform to this standard
* for one reason or another. But for sources that do, this parser should
* be able to handle the data.
* Note that the CSV format does not specify how to handle different
* data types; it is entirely a string-based format.
* So we just use standard parsers based on the corresponding column type.
*/
template <class StringParsersImpl>
class LibCSVParser : public UDParser {
public:
LibCSVParser() : colNum(0) {}
// Keep a copy of the information about each column.
// Note that Vertica doesn't let us safely keep a reference to
// the internal copy of this data structure that it shows us.
// But keeping a copy is fine.
SizedColumnTypes colInfo;
// An instance of the class containing the methods that we're
// using to parse strings to the various relevant data types
StringParsersImpl sp;
/// Current column index
size_t colNum;
/// Parsing state for libcsv
struct csv_parser parser;
// Format strings
std::vector<std::string> formatStrings;
/**
* Given a field in string form (a pointer to the first character and
* a length), submit that field to Vertica.
* `colNum` is the column number from the input file; how many fields
* it is into the current record.
*/
bool handleField(size_t colNum, char* start, size_t len) {
if (colNum >= colInfo.getColumnCount()) {
// Ignore column overflow
return false;
}
// Empty columns are null.
if (len==0) {
writer->setNull(colNum);
return true;
} else {
return parseStringToType(start, len, colNum, colInfo.getColumnType(colNum), writer, sp);
}
}
static void handle_record(void *data, size_t len, void *p) {
static_cast<LibCSVParser*>(p)->handleField(static_cast<LibCSVParser*>(p)->colNum++, (char*)data, len);
}
static void handle_end_of_row(int c, void *p) {
// Ignore 'c' (the terminating character); trust that it's correct
static_cast<LibCSVParser*>(p)->colNum = 0;
static_cast<LibCSVParser*>(p)->writer->next();
}
virtual StreamState process(ServerInterface &srvInterface, DataBuffer &input, InputState input_state) {
size_t processed;
while ((processed = csv_parse(&parser, input.buf + input.offset, input.size - input.offset,
handle_record, handle_end_of_row, this)) > 0) {
input.offset += processed;
}
if (input_state == END_OF_FILE && input.size == input.offset) {
csv_fini(&parser, handle_record, handle_end_of_row, this);
return DONE;
}
return INPUT_NEEDED;
}
virtual void setup(ServerInterface &srvInterface, SizedColumnTypes &returnType);
virtual void destroy(ServerInterface &srvInterface, SizedColumnTypes &returnType) {
csv_free(&parser);
}
};
template <class StringParsersImpl>
void LibCSVParser<StringParsersImpl>::setup(ServerInterface &srvInterface, SizedColumnTypes &returnType) {
csv_init(&parser, CSV_APPEND_NULL);
colInfo = returnType;
}
template <>
void LibCSVParser<FormattedStringParsers>::setup(ServerInterface &srvInterface,
SizedColumnTypes &returnType) {
csv_init(&parser, CSV_APPEND_NULL);
colInfo = returnType;
if (formatStrings.size() != returnType.getColumnCount()) {
formatStrings.resize(returnType.getColumnCount(), "");
}
sp.setFormats(formatStrings);
}
template <class StringParsersImpl>
class LibCSVParserFactoryTmpl : public ParserFactory {
public:
virtual void plan(ServerInterface &srvInterface,
PerColumnParamReader &perColumnParamReader,
PlanContext &planCtxt) {}
virtual UDParser* prepare(ServerInterface &srvInterface,
PerColumnParamReader &perColumnParamReader,
PlanContext &planCtxt,
const SizedColumnTypes &returnType)
{
return vt_createFuncObj(srvInterface.allocator,
LibCSVParser<StringParsersImpl>);
}
};
typedef LibCSVParserFactoryTmpl<StringParsers> LibCSVParserFactory;
RegisterFactory(LibCSVParserFactory);
typedef LibCSVParserFactoryTmpl<FormattedStringParsers> FormattedLibCSVParserFactory;
RegisterFactory(FormattedLibCSVParserFactory);
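The quick and dirty way would be to just hardcode it. The parser uses a callback, handle_end_of_row, at the end of each row. Track the row count in a new rowcnt counter and just don't emit the first row. (bad_field and currSrvInterface do not appear in the listing above and would also need to exist as members.) Something like: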
static void handle_end_of_row(int c, void *ptr) {
// Ignore 'c' (the terminating character); trust that it's correct
LibCSVParser *p = static_cast<LibCSVParser*>(ptr);
p->colNum = 0;
if (p->rowcnt <= 0) {
p->bad_field = "";
p->rowcnt++;
} else if (p->bad_field.empty()) {
p->writer->next();
} else {
// libcsv doesn't give us the whole row to reject.
// So just write to the log.
// TODO: Come up with something more clever.
if (p->currSrvInterface) {
p->currSrvInterface->log("Invalid CSV field value: '%s' Row skipped.",
p->bad_field.c_str());
}
p->bad_field = "";
}
}
Also, it is best to initialize rowcnt = 0 in process, since I think it will call this for each file in your COPY statement. There might be more clever ways of doing this. Basically, this will just process the record and then discard it.
As for supporting SKIP generically: look at TraditionalCSVParser for how to handle parameter passing. You'd have to add the parameter to the parser factory's prepare method, pass the value in to the LibCSVParser class, and override getParameterType. Then in LibCSVParser you need to accept the parameter in the constructor and modify process to skip the first skip rows. Then use that value instead of the hardcoded 0 above.
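To illustrate the counting logic on its own, here is a minimal sketch that does not use the Vertica SDK or libcsv at all. RowSkipper, onField, onEndOfRow and parseSkipping are hypothetical names made up for this example; onField and onEndOfRow only play the roles of handle_record and handle_end_of_row, and the toy splitter does not handle quoting.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the parser state; not part of the Vertica SDK.
struct RowSkipper {
    std::size_t skipRows;                            // number of leading rows to discard
    std::size_t rowsSeen = 0;                        // rows completed so far
    std::vector<std::string> current;                // fields of the row being read
    std::vector<std::vector<std::string>> emitted;   // rows that were kept

    explicit RowSkipper(std::size_t skip) : skipRows(skip) {}

    // Plays the role of handle_record: buffer one field of the current row.
    void onField(const std::string& field) { current.push_back(field); }

    // Plays the role of handle_end_of_row: keep the row only after
    // skipRows rows have already been discarded.
    void onEndOfRow() {
        if (rowsSeen >= skipRows) {
            emitted.push_back(current);
        }
        current.clear();
        ++rowsSeen;
    }
};

// Toy splitter for comma-separated lines (no quoting), driving the callbacks.
std::vector<std::vector<std::string>> parseSkipping(const std::string& text,
                                                    std::size_t skip) {
    RowSkipper s(skip);
    std::string field;
    for (char c : text) {
        if (c == ',') {
            s.onField(field);
            field.clear();
        } else if (c == '\n') {
            s.onField(field);
            field.clear();
            s.onEndOfRow();
        } else {
            field += c;
        }
    }
    // Flush a final row that has no trailing newline.
    if (!field.empty() || !s.current.empty()) {
        s.onField(field);
        s.onEndOfRow();
    }
    return s.emitted;
}
```

Calling parseSkipping with skip set to 1 drops the header row; with skip set to 0 every row is kept. The row is still parsed in full and only the emit step is suppressed, which matches the "process the record and then discard it" approach described above.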

Protobuf INT64 check against previously entered data c++

message HealthOccurrenceCount
{
required int64 HealthID=1;
required int32 OccCount=2;
optional bytes wci=3;
}
I would like to add data based on HealthID. If a HealthID has already been entered, then instead of adding a new entry, the program should just increment the existing entry's OccCount.
HealthOccurrenceCount objHelthOccCount;
if(objHelthOccCount.healthid() == healthID) // Is this right or do I need to iterate all the nodes?
{
occCount++;
objHelthOccCount.set_occcount(occCount);
}
else
occCount = 1;
Is this code correct, or should I convert HealthID into a string?
Generated Code:
// required int64 HealthID = 1;
inline bool has_healthid() const;
inline void clear_healthid();
static const int kHealthIDFieldNumber = 1;
inline ::google::protobuf::int64 healthid() const;
inline void set_healthid(::google::protobuf::int64 value);
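According to the documentation, there is a has_ method for each singular (required or optional) field, which returns true if that field has been set. Note that has_healthid() only tells you whether the field is set, not whether it holds a particular ID.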
Your code would then be something like:
HealthOccurrenceCount objHelthOccCount;
if(objHelthOccCount.has_healthid())
{
occCount++;
objHelthOccCount.set_occcount(occCount);
}
else
occCount = 1;
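If the goal is "increment when this ID was seen before, otherwise add a new entry", one possible approach is to keep the entries in a map keyed by the ID, so no iteration over existing entries is needed. The sketch below is only an illustration under assumptions: HealthOccurrence is a hypothetical plain struct standing in for the generated HealthOccurrenceCount class, and HealthCounter is a made-up helper. An int64 key can be compared directly, so converting HealthID to a string should not be necessary.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical plain struct standing in for the generated message class.
struct HealthOccurrence {
    std::int64_t healthId;
    std::int32_t occCount;
};

// Made-up helper: tracks one entry per HealthID.
class HealthCounter {
public:
    // Increments the count if healthId is already present, otherwise
    // inserts a new entry with count 1. Returns the updated count.
    std::int32_t record(std::int64_t healthId) {
        auto it = entries_.find(healthId);
        if (it == entries_.end()) {
            entries_.emplace(healthId, HealthOccurrence{healthId, 1});
            return 1;
        }
        return ++it->second.occCount;
    }

    // Number of distinct HealthIDs recorded so far.
    std::size_t size() const { return entries_.size(); }

private:
    std::unordered_map<std::int64_t, HealthOccurrence> entries_;
};
```

With real generated messages, the map value could hold the message object itself, and the increment would go through set_occcount(occcount() + 1).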

validating structures - how to pass string to sizeof and offset methods?

My program contains auto-generated structures. When starting I need to validate them with "server" information - so I can be sure that my auto-generated structures are up to date. Server and local structures are valid if they are of the same size, contain fields with the same name and size (and type ideally should be validated too).
This is what I wrote so far:
void SchemeValidator::Validate(cg_scheme_desc_t* schemedesc, Logger& logger)
{
struct cg_message_desc_t* msgdesc = schemedesc->messages;
while (msgdesc)
{
struct cg_field_desc_t* fielddesc = msgdesc->fields;
char* structName = msgdesc->name;
size_t structSize = msgdesc->size;
logger.Debug("Message %s, block size = %d", structName, structSize);
if (strcmp(structName, "orders")) {
if (sizeof(orders) != structSize) {
printf("Validator error, structure 'orders', local size = %d server size = %d!", sizeof(orders), structSize);
throw std::exception("Validator error, structure 'orders' wrong size!");
}
while (fielddesc)
{
logger.Debug("\tField %s = %s [size=%d, offset=%d]", fielddesc->name, fielddesc->type, fielddesc->size, fielddesc->offset);
if (offsetof(struct orders, fielddesc->name) != fielddesc->offset) {
throw std::exception("orders structure offset wrong");
}
// TODO: validate fielddesc->size == sizeof corresponding field in structure
fielddesc = fielddesc->next;
}
} else {
throw std::exception("Validator error, validation not implemented!");
}
msgdesc = msgdesc->next;
}
}
There are a lot of problems:
I wrote if (strcmp(structName, "orders")) because later I need to use orders in several expressions, including sizeof(orders) and offsetof(struct orders, fielddesc->name). But I have a lot of structures, and I would have to copy-paste this block for each of them. Can I somehow pass a string to sizeof and offsetof, or achieve the desired effect some other way?
offsetof(struct orders, fielddesc->name) doesn't work for the same reason: the second parameter cannot be a string. I receive error C2039: 'fielddesc' : is not a member of 'orders'.
For the same reason I can't validate fielddesc->size.
How can I achieve the desired validation without intensive copy-pasting and hard-coded values?
I believe you can accomplish what you want by defining structs with typedefs and using a templated function. The following should replace what you have (obviously not tested)
struct OrderType {
typedef cg_field_desc_t* ListItem;
typedef orders Class;
static const std::string Name;
};
const std::string OrderType::Name = "orders";
template <class T>
void checkStuff(typename T::ListItem fielddesc, const char* name, size_t structSize, Logger& logger)
{
// strcmp returns 0 when the two names are equal
if (strcmp(name, T::Name.c_str()) == 0) {
if (sizeof(typename T::Class) != structSize) {
printf("Validator error, structure '%s', local size = %zu server size = %zu!", T::Name.c_str(), sizeof(typename T::Class), structSize);
throw std::exception("Validator error, structure has wrong size!");
}
while (fielddesc)
{
logger.Debug("\tField %s = %s [size=%d, offset=%d]", fielddesc->name, fielddesc->type, fielddesc->size, fielddesc->offset);
// offsetof needs a member name known at compile time, so it cannot
// take fielddesc->name; checking offsets requires a per-struct table
// of (name, offset, size) entries built with offsetof and sizeof.
// TODO: validate fielddesc->size == sizeof corresponding field in structure
fielddesc = fielddesc->next;
}
} else {
throw std::exception("Validator error, validation not implemented!");
}
}
You then replace your if statement with
checkStuff<OrderType>(fielddesc, structName, structSize, logger);
You can extend this to other types by defining new structs
struct OtherType {
typedef some_other_t* ListItem;
typedef bizbaz Class;
static const std::string Name;
};
const std::string OtherType::Name = "foobar";
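Since offsetof and sizeof need member names at compile time, the per-field checks need some table that maps a runtime name to a compile-time offset and size. The following is a rough sketch of one way to build such a table with a macro. Everything here is hypothetical: the orders struct is a stand-in with invented members, and FieldLayout, FIELD_LAYOUT, ordersLayout and checkField are names made up for the example. A code generator could presumably emit one such table next to each auto-generated structure.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical local struct standing in for an auto-generated one.
struct orders {
    long long id;
    int amount;
    char status;
};

// One row of layout information about a struct member.
struct FieldLayout {
    const char* name;
    std::size_t offset;
    std::size_t size;
};

// Expands a member name into {"name", offset, size} at compile time.
// sizeof does not evaluate its operand, so the null pointer is never dereferenced.
#define FIELD_LAYOUT(Struct, member) \
    FieldLayout{ #member, offsetof(Struct, member), sizeof(((Struct*)nullptr)->member) }

// Layout table for 'orders'; one such table would exist per struct.
inline const std::vector<FieldLayout>& ordersLayout() {
    static const std::vector<FieldLayout> table = {
        FIELD_LAYOUT(orders, id),
        FIELD_LAYOUT(orders, amount),
        FIELD_LAYOUT(orders, status),
    };
    return table;
}

// Looks up a field by its runtime name and compares offset and size.
// Returns an empty string on a match, otherwise a description of the mismatch.
std::string checkField(const std::vector<FieldLayout>& table,
                       const char* name, std::size_t offset, std::size_t size) {
    for (const FieldLayout& f : table) {
        if (std::strcmp(f.name, name) == 0) {
            if (f.offset != offset) return std::string("offset mismatch for ") + name;
            if (f.size != size) return std::string("size mismatch for ") + name;
            return std::string();
        }
    }
    return std::string("unknown field ") + name;
}
```

Inside the validation loop, each server-side descriptor would then be checked with something like checkField(ordersLayout(), fielddesc->name, fielddesc->offset, fielddesc->size), which covers both the offset check and the size TODO without hard-coding values by hand.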