GzipOutputStream fails to serialise to a string using ProtoBuf - c++

I'm trying to incorporate Protocol buffers in my project. I have created the following very simple schema:
syntax = "proto3";
message Document{
string title = 1;
int64 size = 2;
int64 data = 3;
}
Then in my C++ code (after compiling the schema with the protoc compiler) I use it like this:
Document document;
document.set_title(random_string(20));
document.set_data(300);
document.set_size(500);
std::string docString = document.SerializeAsString();
std::string compressedString;
google::protobuf::io::StringOutputStream stream(&compressedString);
google::protobuf::io::GzipOutputStream gStream(&stream);
document.SerializeToZeroCopyStream(&gStream);
std::cout << docString.size() << std::endl;
std::cout << compressedString.size() << std::endl;
The output of the code above is:
28 0
So the compressed string is empty, while the uncompressed string is 28 bytes. What is the correct way to use GzipOutputStream to compress the serialised data of a protocol buffer?

Related

C++ iterate avro schema and map it to Key Value (name , type)

I'm learning Avro and C++ (both together :) ), and what I'm trying to do is:
Load a schema.
Map the schema fields to key/value pairs of name and type.
Then iterate over the Avro data according to the mapping.
From what I've found so far, I extract the schema from the Avro data file and handle it as a GenericDatum.
When I try to iterate over its fields, I get the name correctly, but the field type is null, where I would expect the actual type. Any help would be appreciated.
My code:
const char *avroFilePathCstr = avroFilePath.c_str();
avro::DataFileReader<avro::GenericDatum> reader(avroFilePathCstr);
auto dataSchema = reader.dataSchema();
avro::GenericDatum datum(dataSchema);
ProcessAvroSchema(dataSchema);
void ProcessAvroSchema(avro::GenericDatum schema) {
const avro::GenericRecord& schemaRecord = schema.value<avro::GenericRecord>();
for(unsigned int i = 0 ; i < schemaRecord.fieldCount(); i++)
{
avro::GenericDatum fieldDatum = schemaRecord.fieldAt(i);
cout << "SCHEMA:: fieldName: " << schemaRecord.schema() -> nameAt(i) << " , fieldType: " << fieldDatum.type() << "\n";
}
}

protobuf C++ SQLite handle blob data

I have a SQLite database which has a table which contains some fields of BLOB type.
What I am trying to do is fetch that field (in fact, all the other fields too) from the database into C++, send it through protobuf, and receive it on the other side.
I have defined the blob fields as bytes in the .proto file.
For example
message fields{
...
bytes myBlobField = 1;
}
My c++ file contains
sqlite3_initialize();
rc = sqlite3_open_v2(db_url, &db,SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE,NULL);
std::ostringstream oss;
oss << "select * from attribtable ";
std::string query = oss.str();
rc = sqlite3_prepare_v2(db,query.c_str(),-1,&stmt,NULL);
while(sqlite3_step(stmt) == SQLITE_ROW){
sqlite3_column_blob(stmt,10); //This is the blob field
}
How do I store the sqlite3_column_blob(stmt,10) in C++ and how do I set myBlobField using
say reply->set_myblobfield(??)
and receive on the client side using
say receive->get_myblobfield()
So in simple words my question is how do I send the blobfield fetched from database, through protobuf, from server to client in a C++ application?
Using this .proto file
syntax = "proto2";
package prototest;
message fields{
required bytes myBlobField = 1;
}
Initialize the field by calling set_myblobfield() with the blob pointer and the blob's size in bytes, both of which you get from SQLite, and then call SerializeToOstream() to write the message to a stream or a file.
std::ofstream myoutput("myoutput.bin", std::ios::binary); // binary mode: the payload is not text
while (sqlite3_step(stmt) == SQLITE_ROW)
{
if (size_t blobSize = sqlite3_column_bytes(stmt, 10))
{
if (const void* blob = sqlite3_column_blob(stmt, 10))
{
prototest::fields myfields;
myfields.set_myblobfield(blob, blobSize);
myfields.SerializeToOstream(&myoutput);
}
}
}

json serialize c++

I have this C++ code, and am having trouble json serializing it.
string uInput;
string const& retInput;
while(!std::cin.eof()) {
getline(cin, uInput);
JSONExample source; //JSON enabled class from jsonserialize.h
source.text = uInput;
//create JSON from producer
std::string json = JSON::producer<JSONExample>::convert(source); //string -> returns {"JSONExample":{"text":"hi"}}
//then create new instance from a consumer...
JSONExample sink = JSON::consumer<JSONExample>::convert(json);
//retInput = serialize(sink);
// Json::FastWriter fastWriter;
// retInput = fastWriter.write(uInput);
retInput = static_cast<string const&>(uInput);
pubnub::futres fr_2 = pb_2.publish(chan, retInput);
cout << "user input as json which should be published is " << retInput<< std::endl;
while(!cin.eof()) {
getline(cin, uInput);
newInput = "\"\\\"";
newInput += uInput;
newInput += "\\\"\"";
Instead of requiring the message to be typed as "\"hi\"", this code takes hi and adds the escaped quotes itself.
If the change you described made the "Invalid JSON" disappear, then a "more correct" solution would be, AFAICT, to change the publish() line to:
pubnub::futres fr_2 = pb_2.publish(chan, json);
This works because json already holds the JSON-serialized data, assuming that JSON is what you want to publish.

How to get protobuf enum as string?

Is it possible to obtain the string equivalent of protobuf enums in C++?
e.g.:
The following is the message description:
package MyPackage;
message MyMessage
{
enum RequestType
{
Login = 0;
Logout = 1;
}
optional RequestType requestType = 1;
}
In my code I wish to do something like this:
MyMessage::RequestType requestType = MyMessage::RequestType::Login;
// requestTypeString will be "Login"
std::string requestTypeString = ProtobufEnumToString(requestType);
The EnumDescriptor and EnumValueDescriptor classes can be used for this kind of manipulation, and the names in the generated .pb.h and .pb.cc files are easy enough to read, so you can look through them for details on the functions they offer.
In this particular case, the following should work (untested):
std::string requestTypeString = MyMessage_RequestType_Name(requestType);
See the answer of Josh Kelley, use the EnumDescriptor and EnumValueDescriptor.
The EnumDescriptor documentation says:
To get a EnumDescriptor
To get the EnumDescriptor for a generated enum type, call
TypeName_descriptor(). Use DescriptorPool to construct your own
descriptors.
To get the string value, use FindValueByNumber(int number)
const EnumValueDescriptor * EnumDescriptor::FindValueByNumber(int number) const
Looks up a value by number.
Returns NULL if no such value exists. If multiple values have this number, the first one defined is returned.
For example, given this protobuf enum:
enum UserStatus {
AWAY = 0;
ONLINE = 1;
OFFLINE = 2;
}
The code to read the string name from a value and the value from a string name:
const google::protobuf::EnumDescriptor *descriptor = UserStatus_descriptor();
std::string name = descriptor->FindValueByNumber(UserStatus::ONLINE)->name();
int number = descriptor->FindValueByName("ONLINE")->number();
std::cout << "Enum name: " << name << std::endl;
std::cout << "Enum number: " << number << std::endl;

Cassandra cppdriver Query String Buffer Overflow?

I have been writing a wrapper for the Cassandra cppdriver for CQL3.0 and I have come across some odd behavior, and I am not sure if it is typical or a bug.
For reference, I am working with the cppdriver code released on 4 September (from the repository), libuv0.10, and the songs / playlist example posted on the DataStax website (http://www.datastax.com/documentation/cql/3.1/cql/ddl/ddl_music_service_c.html).
The problem that I am having is with executing query strings. There seems to be some threshold of characters after which the query string being sent to Cassandra becomes garbage. The code that I am using to construct and send the string to the cppdriver library (and parse the results) is provided below. I added a function (cass_session_print_query) to the cassandra.h and session.cpp files to print out the generated statement.
map<string, vector<string> > retresults;
int i = 0, ccount;
stringstream ss;
vector<string> keys = get.GetList();
vector<string>::iterator kit = keys.begin();
map<int, pair<string, string> > primkeys = get.GetMap();
map<int, pair<string, string> >::iterator mit = primkeys.begin();
if (!keys.empty())
{
ss << "SELECT " << (*kit);
++kit;
for ( ; kit != keys.end(); ++kit)
ss << "," << (*kit);
ss << " FROM " << tablename;
if (!primkeys.empty())
{
ss << " WHERE ";
ss << mit->second.first << " = ?";
++mit;
for ( ; mit != primkeys.end(); ++mit)
ss << " and " << mit->second.first << " = ?";
mit = primkeys.begin();
}
ss << ";";
cass_bool_t has_more_pages = cass_false;
const CassResult* result = NULL;
CassString query = cass_string_init(ss.str().c_str());
CassStatement* statement = cass_statement_new(query, primkeys.size());
for ( ; mit != primkeys.end(); ++mit)
cass_statement_bind_string(statement, i++, cass_string_init(mit->second.second.c_str()));
cass_statement_set_paging_size(statement, 100);
do
{
cass_session_print_query(statement);
CassIterator* iterator;
CassFuture* future = cass_session_execute(session_, statement);
if (cass_future_error_code(future) != 0)
{
CassString message = cass_future_error_message(future);
fprintf(stderr, "Error: %.*s\n", (int)message.length, message.data);
break;
}
result = cass_future_get_result(future);
ccount = cass_result_column_count(result);
vector<string> cnames;
for (i = 0; i < ccount; i++)
cnames.push_back(cass_result_column_name(result, i).data);
iterator = cass_iterator_from_result(result);
ListVector::iterator vit;
while (cass_iterator_next(iterator))
{
const CassRow* row = cass_iterator_get_row(iterator);
for (vit = cnames.begin(); vit != cnames.end(); ++vit)
{
CassString value;
char value_buffer[256];
cass_value_get_string(cass_row_get_column_by_name(row, (*vit).c_str()), &value);
if (value.length == 0 || value.data == NULL)
continue;
memcpy(value_buffer, value.data, value.length);
value_buffer[value.length] = '\0';
retresults[(*vit)].push_back(value_buffer);
}
}
has_more_pages = cass_result_has_more_pages(result);
if (has_more_pages)
cass_statement_set_paging_state(statement, result);
cass_iterator_free(iterator);
cass_result_free(result);
} while (has_more_pages);
}
return retresults;
With this, an initial query string of SELECT id,album,title,artist,data FROM songs; results in a Cassandra query string of SELECT id,album,title,artist,data FROM songs;. However, if I add one more column to the SELECT portion SELECT id,album,title,artist,data,tags FROM songs; the query string in the Cassandra cppdriver library becomes something like: ,ar����,dat�� jOM songX. This results in the following error from Cassandra / library: Error: line 1:49 no viable alternative at character '�'.
I have also tried fewer columns, but with a WHERE clause, and that results in the same problem.
Is this a bug? Or am I building and sending strings to the cppdriver library incorrectly?
You should cass_future_wait() on the execute future before testing the error code.
Unrelated: there are also a couple of things that should be freed (future, statement), but I'm assuming that was omitted to keep this concise.
So, it looks like (for whatever reason) I HAVE to parse out the row key from the results. I checked the example, and there I was able to skip parsing the row key and everything still worked. I am not yet entirely sure what is forcing me to do this (compared to the provided paging example), but for others: you need to include the following within the while (cass_iterator_next(iterator)) block to "magically" fix my code above.
CassUuid key;
char key_buffer[CASS_UUID_STRING_LENGTH];
const CassRow* row = cass_iterator_get_row(iterator);
cass_value_get_uuid(cass_row_get_column(row, 0), key);
cass_uuid_string(key, key_buffer);
This is really a long shot, but since you mentioned the Music Service example, did you possibly download and use cql_collections.zip query strings? If so, the strings (now fixed) had minor syntax errors:
-use music
-CREATE TABLE music.songs ( id uuid PRIMARY KEY, album text, artist text, data blob, reviews list<text>, tags set<text>, title text, venue map<timestamp, text>
+use music;
+CREATE TABLE music.songs ( id uuid PRIMARY KEY, album text, artist text, data blob, reviews list<text>, tags set<text>, title text, venue map<timestamp, text>);
AeroBuffalo's code worked for me, except that I had to put '&' in front of the second argument of the cass_value_get_uuid() function, which expects a pointer to the CassUuid.
cass_value_get_uuid(cass_row_get_column(row, 0), &key);