Azure C++ library: "Invalid streambuf object"

I am trying to download a potentially huge Azure block blob, using the C++ Azure client library. It isn't working because I don't know how to initialize a concurrency::streams::streambuf object with a buffer size. My code looks like this:
// Assume blockBlob has been created correctly.
concurrency::streams::istream blobStream = blockBlob.open_read();
// I don't know how to initialize this streambuf:
concurrency::streams::streambuf<uint8_t> dlStreamBuf;
size_t bytesSoFar = 0, nBytesReturned = 0, nBytesToRead = 65536;
do {
    // This gets the exception "Invalid streambuf object":
    concurrency::task<size_t> returnedTask = blobStream.read(dlStreamBuf, nBytesToRead);
    nBytesReturned = returnedTask.get();
    bytesSoFar += nBytesReturned;
    // Process the data in dlStreamBuf here...
} while (nBytesReturned > 0);
blobStream.close();
Note that the above streambuf is not to be confused with a standard C++ streambuf.
Can anyone advise me on how to properly construct and initialize a concurrency::streams::streambuf?
Thanks.

streambuf is a template class, and a default-constructed one is not backed by any storage (hence "Invalid streambuf object"). Use a concrete buffer type such as container_buffer instead:
concurrency::streams::container_buffer<std::vector<uint8_t>> output_buffer;
size_t bytesSoFar = 0, nBytesReturned = 0, nBytesToRead = 65536;
do {
    concurrency::task<size_t> returnedTask = blobStream.read(output_buffer, nBytesToRead);
    nBytesReturned = returnedTask.get();
    bytesSoFar += nBytesReturned;
    // Process the data in output_buffer here...
} while (nBytesReturned > 0);
blobStream.close();
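To get at the bytes after each read, container_buffer exposes its backing container directly (a short sketch; note that successive reads into the same container_buffer append to the vector rather than overwrite it):
std::vector<uint8_t>& data = output_buffer.collection();
// data.size() grows by nBytesReturned after each successful read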
Sample code is here: https://github.com/Azure/azure-storage-cpp/blob/76cb553249ede1e6f05456d936c9a36753cc1597/Microsoft.WindowsAzure.Storage/tests/blob_streams_test.cpp#L192

I haven't used the stream methods for C++, but the C++ documentation here mentions two ways of downloading, to files or to streams.
An example using the download_to_stream method:
// Retrieve storage account from connection string.
azure::storage::cloud_storage_account storage_account = azure::storage::cloud_storage_account::parse(storage_connection_string);
// Create the blob client.
azure::storage::cloud_blob_client blob_client = storage_account.create_cloud_blob_client();
// Retrieve a reference to a previously created container.
azure::storage::cloud_blob_container container = blob_client.get_container_reference(U("my-sample-container"));
// Retrieve reference to a blob named "my-blob-1".
azure::storage::cloud_block_blob blockBlob = container.get_block_blob_reference(U("my-blob-1"));
// Save blob contents to a file.
concurrency::streams::container_buffer<std::vector<uint8_t>> buffer;
concurrency::streams::ostream output_stream(buffer);
blockBlob.download_to_stream(output_stream);
std::ofstream outfile("DownloadBlobFile.txt", std::ofstream::binary);
std::vector<unsigned char>& data = buffer.collection();
outfile.write((char *)&data[0], buffer.size());
outfile.close();
Alternatively, using download_text to read the blob contents as a string:
// Retrieve storage account from connection string.
azure::storage::cloud_storage_account storage_account = azure::storage::cloud_storage_account::parse(storage_connection_string);
// Create the blob client.
azure::storage::cloud_blob_client blob_client = storage_account.create_cloud_blob_client();
// Retrieve a reference to a previously created container.
azure::storage::cloud_blob_container container = blob_client.get_container_reference(U("my-sample-container"));
// Retrieve reference to a blob named "my-blob-2".
azure::storage::cloud_block_blob text_blob = container.get_block_blob_reference(U("my-blob-2"));
// Download the contents of a blob as a text string.
utility::string_t text = text_blob.download_text();
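For completeness, cloud_blob also provides a download_to_file convenience method when you just want the blob on disk; a one-line sketch, reusing the blockBlob reference from the first example (the path is illustrative):
blockBlob.download_to_file(U("DownloadedBlobFile.txt"));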

Related

Read CSV from std::vector<unsigned char> using Apache Arrow

I am trying to read CSV input using Apache Arrow. The example here mentions that the input should be an InputStream; in my case, however, I just have a std::vector of unsigned chars. Is it possible to parse this using Apache Arrow? I have checked the I/O interface to see if there is an "in-memory" data structure, with no luck.
I copy-paste the example code here for convenience, along with my input data:
#include "arrow/csv/api.h"
{
// ...
std::vector<unsigned char> data;
arrow::io::IOContext io_context = arrow::io::default_io_context();
// how can I fit the std::vector to the input stream?
std::shared_ptr<arrow::io::InputStream> input = ...;
auto read_options = arrow::csv::ReadOptions::Defaults();
auto parse_options = arrow::csv::ParseOptions::Defaults();
auto convert_options = arrow::csv::ConvertOptions::Defaults();
// Instantiate TableReader from input stream and options
auto maybe_reader =
arrow::csv::TableReader::Make(io_context,
input,
read_options,
parse_options,
convert_options);
if (!maybe_reader.ok()) {
// Handle TableReader instantiation error...
}
std::shared_ptr<arrow::csv::TableReader> reader = *maybe_reader;
// Read table from CSV file
auto maybe_table = reader->Read();
if (!maybe_table.ok()) {
// Handle CSV read error
// (for example a CSV syntax error or failed type conversion)
}
std::shared_ptr<arrow::Table> table = *maybe_table;
}
Any help would be appreciated!
The I/O interface docs list BufferReader, which works as an in-memory input stream. While not shown in the docs, it can be constructed from a pointer and a size, which lets you wrap your std::vector<unsigned char> without copying.
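A minimal sketch of that approach, assuming the data vector from the question (the (pointer, size) constructor wraps the bytes in place, so data must stay alive while the reader is in use):
#include "arrow/io/memory.h"

std::shared_ptr<arrow::io::InputStream> input =
    std::make_shared<arrow::io::BufferReader>(
        reinterpret_cast<const uint8_t*>(data.data()),
        static_cast<int64_t>(data.size()));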

protobuf C++ SQLite handle blob data

I have a SQLite database with a table that contains some fields of BLOB type.
What I am trying to do is fetch the field (in fact all the other fields too) from the database in C++, send it through protobuf, and receive it on the client side.
I have defined the blob fields as bytes in the .proto file
For example
message fields{
...
bytes myBlobField = 1;
}
My C++ file contains:
sqlite3_initialize();
rc = sqlite3_open_v2(db_url, &db, SQLITE_OPEN_READWRITE | SQLITE_OPEN_CREATE, NULL);

std::ostringstream oss;
oss << "select * from attribtable";
std::string query = oss.str();
rc = sqlite3_prepare_v2(db, query.c_str(), -1, &stmt, NULL);

while (sqlite3_step(stmt) == SQLITE_ROW) {
    sqlite3_column_blob(stmt, 10); // This is the blob field
}
How do I store the result of sqlite3_column_blob(stmt, 10) in C++, how do I set myBlobField using, say, reply->set_myblobfield(??), and how do I receive it on the client side using, say, receive->get_myblobfield()?
In simple words: how do I send the blob field fetched from the database through protobuf from server to client in a C++ application?
Using this .proto file
syntax = "proto2";
package prototest;
message fields{
required bytes myBlobField = 1;
}
You initialize the blob with the set_myblobfield() call, passing the blob pointer and the byte size you get from SQLite, and then call SerializeToOstream() to write the message to a stream or file.
std::ofstream myoutput("myoutput.bin", std::ofstream::binary);
while (sqlite3_step(stmt) == SQLITE_ROW)
{
    if (size_t blobSize = sqlite3_column_bytes(stmt, 10))
    {
        if (const void* blob = sqlite3_column_blob(stmt, 10))
        {
            prototest::fields myfields;
            myfields.set_myblobfield(blob, blobSize);
            myfields.SerializeToOstream(&myoutput);
        }
    }
}
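On the receiving side, the same generated message type parses the stream back, and the bytes field comes out as a std::string; a sketch, assuming the myoutput.bin file written above (note that several messages serialized back-to-back without length prefixes will be merged into one by ParseFromIstream):
std::ifstream myinput("myoutput.bin", std::ifstream::binary);
prototest::fields received;
if (received.ParseFromIstream(&myinput))
{
    const std::string& blob = received.myblobfield();
    // blob.data() and blob.size() give back the raw BLOB bytes
}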

Minio: Why is my PUT request failing, even though GET is successful?

I am using Minio to emulate S3 and test my code locally. My code is written using the AWS SDK for C++.
What I would like to do (for testing purposes) is to get an object from Minio, store it and then send the same object back to Minio using a PUT request. The PUT request fails with the error Unable to connect to endpoint. I am however able to use curl to PUT objects to Minio.
This is how I set up my S3Client (I added some links to explain why I did things this way):
auto credentialsProvider = Aws::MakeShared<Aws::Auth::EnvironmentAWSCredentialsProvider>("someTag");
Aws::Client::ClientConfiguration config;
config.endpointOverride = Aws::String("172.17.0.2:9000");
config.scheme = Aws::Http::Scheme::HTTP;
// disable SSL: https://github.com/aws/aws-sdk-cpp/issues/284
config.verifySSL = false;
// set region to default https://github.com/awslabs/amazon-kinesis-producer/issues/66
config.region = "us-east-1";
// disable virtual addressing and payload signing: https://stackoverflow.com/questions/47105289/how-to-override-endpoint-in-aws-sdk-cpp-to-connect-to-minio-server-at-localhost
Aws::S3::S3Client client(credentialsProvider, config, Aws::Client::AWSAuthV4Signer::PayloadSigningPolicy::Never, false);
This is what both my GET and my PUT requests look like. GET works, PUT does not:
// declare request
Aws::S3::Model::GetObjectRequest get_obj_req;
get_obj_req.WithBucket("someBucket").WithKey("someKey");
// Get works fine
auto get_object_outcome = client.GetObject(get_obj_req);
if (!get_object_outcome.IsSuccess()) {
    // fail does not happen
}
// write file from Minio to local file (seems to work fine)
int size = 1024;
char buffer[size];
auto &retrieved_file = get_object_outcome.GetResultWithOwnership().GetBody().read(buffer, size);
std::ofstream out("someFile");
out << std::string(buffer);
out.close();
// try to PUT the stored file again
Aws::S3::Model::PutObjectRequest put_obj_req;
const std::shared_ptr<Aws::IOStream> input_data =
Aws::MakeShared<Aws::FStream>("tag", "someFile", std::ios_base::in | std::ios_base::binary);
put_obj_req.SetBody(input_data);
put_obj_req.WithBucket("someBucket").WithKey("someKey");
put_obj_req.SetContentLength(size);
put_obj_req.SetContentType("application/octet-stream");
// PUT request
auto resp = client.PutObject(put_obj_req);
if (!resp.IsSuccess()) {
    // fails here
}
As I already mentioned, I am able to PUT objects to Minio using curl. You can have a look at this Gist.
Sidenote: I am using Minio inside a Docker container.
EDIT: I believe this might be a problem with the data that I want to PUT. If the data has e.g. the Content-Type application/octet-stream I run into an error, but I do not run into this error when using txt files. My current code looks like this, and I assume that the streaming breaks if I want to stream anything but chars. Can you confirm?
Aws::String content_type = get_object_outcome.GetResult().GetContentType();
Aws::IOStream &retrieved_file = get_object_outcome.GetResultWithOwnership().GetBody();
retrieved_file.seekg(0, retrieved_file.end);
int retrieved_file_size = retrieved_file.tellg();
retrieved_file.seekg(0, retrieved_file.beg);
char *buffer = new char[retrieved_file_size];
retrieved_file.read(buffer, retrieved_file_size);
AWS_LOGSTREAM_INFO(TAG, "Retrieved file of size: " + Aws::Utils::StringUtils::to_string(retrieved_file_size));
Aws::StringStream stream(Aws::String(buffer));
const std::shared_ptr<Aws::IOStream> input_data =
Aws::MakeShared<Aws::StringStream>(TAG, Aws::String(buffer));
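One length-aware alternative I have been looking at, since the Aws::String(buffer) constructor above is null-terminated and stops copying at the first zero byte in binary data (a sketch, using the same buffer and retrieved_file_size as above):
const std::shared_ptr<Aws::IOStream> input_data =
    Aws::MakeShared<Aws::StringStream>(TAG, Aws::String(buffer, retrieved_file_size));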

Editing docx with openxml returns invalid memorystream

I created a DLL that takes a Word template and edits the document using OpenXML; the result is then sent via memory stream to a web service, where the document is downloaded by the user. The issue is that the memory stream being sent contains either the original template document without the updates OR the updated Word document XML, where the document is obviously corrupted. Here is the code:
string strTemplate = AppDomain.CurrentDomain.BaseDirectory + "Report Template.docx";
WordprocessingDocument wdDocument;
//stream the template
byte[] fileBytes = File.ReadAllBytes(strTemplate);
MemoryStream memstreamDocument = new MemoryStream();
memstreamDocument.Write(fileBytes, 0, (int)fileBytes.Length);
wdDocument = WordprocessingDocument.Open(memstreamDocument, true);
//CODE TO UPDATE TEMPLATE
//Save entire document
wdDocument.MainDocumentPart.Document.Save();
After saving the document, if I use the following code, the memory stream returns the original template without any updates to the document:
return memstreamDocument;
If I use the following code, the memory stream returns the OpenXML data with the updates, but the document is corrupted:
MemoryStream memstreamUpdatedDocument = new MemoryStream();
Stream streamDocument = wdDocument.MainDocumentPart.GetStream();
streamDocument.CopyTo(memstreamUpdatedDocument);
return memstreamUpdatedDocument;
Here is my code in the web service which works fine:
HttpResponse response = HttpContext.Current.Response;
MemoryStream stream = GR.GetReport("", intReportID, Culture, ConnectionString, false);
response.Clear();
response.ClearHeaders();
response.ClearContent();
response.AddHeader("content-disposition", "attachment; filename=\"" + "Report_" + intReportID+ ".docx\"");
response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document";
response.ContentEncoding = Encoding.GetEncoding("ISO-8859-1");
stream.Position = 0;
stream.CopyTo(response.OutputStream);
response.End();
return response;
After reviewing the supplied code, I have put together a modified snippet that should fit your need of returning a modified MemoryStream from a file template using the OpenXML WordprocessingDocument class. Your web service code snippet should work as is.
// file path of template
string strTemplate = AppDomain.CurrentDomain.BaseDirectory + "Report Template.docx";
// create FileStream to read from template
FileStream fsTemplate = new FileStream(strTemplate, FileMode.Open, FileAccess.Read);
// create MemoryStream to copy template into and modify as needed
MemoryStream msDocument = new MemoryStream();
// copy template FileStream into document MemoryStream
fsTemplate.CopyTo(msDocument);
// close the template FileStream as it is no longer necessary
fsTemplate.Close();
// reset cursor position of document MemoryStream back to top
// before modifying
msDocument.Position = 0;
// create WordProcessingDocument using the document MemoryStream
using (WordprocessingDocument wdDocument = WordprocessingDocument.Open(msDocument, true)) {
//Access the main document part, which contains all references.
MainDocumentPart mainPart = wdDocument.MainDocumentPart;
/* ... CODE TO UPDATE TEMPLATE ... */
// save modification to main document part
wdDocument.MainDocumentPart.Document.Save();
// close wdDocument as it is no longer needed
wdDocument.Close();
}
// reset cursor position of document MemoryStream back to top
msDocument.Position = 0;
// return memory stream as promised
return msDocument;

How to create new GMimeMessage from string?

In my project I use libgmime for MIME types. I'm trying to create a new GMimeMessage using a std::string as the body.
According to the docs this can be done by preparing the data with a GMimeStream and a GMimeDataWrapper, then creating a GMimePart from this data to be set as the MIME part of the new message.
The code:
std::string body = "some test data";
GMimeMessage* message = g_mime_message_new(FALSE);
//set header
g_mime_object_set_header((GMimeObject *) message, name.c_str(), value.c_str());
//create stream and write data into it.
GMimeStream* stream;
g_mime_stream_construct(stream, 0, body.length());
g_mime_stream_write_string(stream, body.c_str());
GMimeDataWrapper* wrapper = g_mime_data_wrapper_new_with_stream(stream, GMIME_CONTENT_ENCODING_DEFAULT);
//create GMimePart to be set as mime part of GMimeMessage
GMimePart* mime_part = g_mime_part_new();
g_mime_part_set_content_object(mime_part, wrapper);
g_mime_message_set_mime_part(message, (GMimeObject *) mime_part);
When I try to create a message this way, I get a segfault here:
g_mime_stream_write_string(stream, body.c_str());
Maybe I'm using the wrong method of message creation...
What's the right way to do this?
The problem is a bad initialization of GMimeStream *stream: the pointer is never actually allocated before it is written to. You need:
GMimeStream *stream;

/* initialize GMime */
g_mime_init (0);

/* create a memory stream holding the message body */
stream = g_mime_stream_mem_new_with_buffer(body.c_str(), body.length());
See doc: http://spruce.sourceforge.net/gmime/tutorial/x49.html
And sample: http://fossies.org/linux/gmime/examples/basic-example.c
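Plugged into the question's code, the stream and wrapper creation then becomes (a sketch, reusing the body string from the question):
GMimeStream* stream = g_mime_stream_mem_new_with_buffer(body.c_str(), body.length());
GMimeDataWrapper* wrapper = g_mime_data_wrapper_new_with_stream(stream, GMIME_CONTENT_ENCODING_DEFAULT);
g_mime_part_set_content_object(mime_part, wrapper);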