Multipart Upload S3 using AWS C++ SDK - c++

I am trying to upload a file to S3 using the multipart upload feature of the AWS C++ SDK. I could find examples for Java, .NET, PHP, Ruby and the REST API, but didn't find any lead on how to do it in C++. Can you please point me in the right direction?

The Transfer Manager needs the file to be stored on disk and to have a filename, which is not good for streaming. Here's the code template that I used for prototyping.
#include <aws/core/Aws.h>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/CreateMultipartUploadRequest.h>
#include <aws/core/utils/HashingUtils.h>
#include <aws/s3/model/CompleteMultipartUploadRequest.h>
#include <aws/s3/model/GetObjectRequest.h>
#include <aws/s3/model/UploadPartRequest.h>
#include <cassert>
#include <sstream>
#include <iostream>
#include <string>
using namespace Aws::S3::Model;
int main(int argc, char *argv[]) {
// new api
Aws::SDKOptions options;
Aws::InitAPI(options);
// use default credential provider chains
Aws::Client::ClientConfiguration clientConfiguration;
clientConfiguration.region = "<your-region>";
clientConfiguration.endpointOverride = "<endpoint-override>";
Aws::S3::S3Client s3_client(clientConfiguration);
std::string bucket = "<bucket>";
std::string key = "<key>";
// initiate the process
Aws::S3::Model::CreateMultipartUploadRequest create_request;
create_request.SetBucket(bucket.c_str());
create_request.SetKey(key.c_str());
create_request.SetContentType("text/plain");
auto createMultipartUploadOutcome =
s3_client.CreateMultipartUpload(create_request);
std::string upload_id = createMultipartUploadOutcome.GetResult().GetUploadId();
std::cout << "multipart upload id is: " << upload_id << std::endl;
// start upload
Aws::S3::Model::UploadPartRequest my_request;
my_request.SetBucket(bucket.c_str());
my_request.SetKey(key.c_str());
my_request.SetPartNumber(1);
my_request.SetUploadId(upload_id.c_str());
Aws::StringStream ss;
// just have a small chunk of data to verify everything works
ss << "to upload";
std::shared_ptr<Aws::StringStream> stream_ptr =
Aws::MakeShared<Aws::StringStream>("WriteStream::Upload" /* log id */, ss.str());
my_request.SetBody(stream_ptr);
Aws::Utils::ByteBuffer part_md5(
Aws::Utils::HashingUtils::CalculateMD5(*stream_ptr));
my_request.SetContentMD5(Aws::Utils::HashingUtils::Base64Encode(part_md5));
auto start_pos = stream_ptr->tellg();
stream_ptr->seekg(0LL, stream_ptr->end);
my_request.SetContentLength(static_cast<long>(stream_ptr->tellg()));
stream_ptr->seekg(start_pos);
auto uploadPartOutcomeCallable1 = s3_client.UploadPartCallable(my_request);
// finish upload
Aws::S3::Model::CompleteMultipartUploadRequest completeMultipartUploadRequest;
completeMultipartUploadRequest.SetBucket(bucket.c_str());
completeMultipartUploadRequest.SetKey(key.c_str());
completeMultipartUploadRequest.SetUploadId(upload_id.c_str());
UploadPartOutcome uploadPartOutcome1 = uploadPartOutcomeCallable1.get();
CompletedPart completedPart1;
completedPart1.SetPartNumber(1);
auto etag = uploadPartOutcome1.GetResult().GetETag();
// the ETag returned for the uploaded part must not be empty
assert(!etag.empty());
completedPart1.SetETag(etag);
CompletedMultipartUpload completedMultipartUpload;
completedMultipartUpload.AddParts(completedPart1);
completeMultipartUploadRequest.WithMultipartUpload(completedMultipartUpload);
auto completeMultipartUploadOutcome =
s3_client.CompleteMultipartUpload(completeMultipartUploadRequest);
if (!completeMultipartUploadOutcome.IsSuccess()) {
auto error = completeMultipartUploadOutcome.GetError();
// report the failure instead of building a message that is never printed
std::cerr << error.GetExceptionName() << ": " << error.GetMessage() << std::endl;
return -1;
}
}
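The template above uploads a single small part. For a real stream, the data first has to be cut into numbered parts. The following is a minimal sketch of that bookkeeping using only the standard library; `PartRange` and `planParts` are hypothetical names, not part of the AWS SDK. It assumes S3's documented limits: every part except the last must be at least 5 MiB, and an upload may have at most 10,000 parts, numbered from 1.

```cpp
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical helper type: one part of a multipart upload.
struct PartRange {
    int partNumber;        // 1-based, the value passed to SetPartNumber()
    std::uint64_t offset;  // byte offset of the part within the object
    std::uint64_t length;  // number of bytes in the part
};

constexpr std::uint64_t kMinPartSize = 5ULL * 1024 * 1024;  // 5 MiB
constexpr std::uint64_t kMaxParts = 10000;

// Split totalBytes into consecutive parts of partSize bytes. The last part
// carries the remainder and is allowed to be smaller than the minimum.
std::vector<PartRange> planParts(std::uint64_t totalBytes, std::uint64_t partSize) {
    if (partSize < kMinPartSize) {
        throw std::invalid_argument("part size is below the 5 MiB minimum");
    }
    const std::uint64_t count = (totalBytes + partSize - 1) / partSize;
    if (count > kMaxParts) {
        throw std::invalid_argument("more than 10,000 parts; use a larger part size");
    }
    std::vector<PartRange> parts;
    parts.reserve(count);
    for (std::uint64_t i = 0; i < count; ++i) {
        const std::uint64_t offset = i * partSize;
        const std::uint64_t remaining = totalBytes - offset;
        const std::uint64_t length = remaining < partSize ? remaining : partSize;
        parts.push_back({static_cast<int>(i + 1), offset, length});
    }
    return parts;
}
```

Each `PartRange` would then correspond to one UploadPartRequest, and the ETag returned for it would be stored in a CompletedPart with the same part number.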

I recommend using the Transfer Manager in general.
But if you don’t want to, for whatever reason, you can look at the Transfer Manager source to see how it performs a multipart upload using the S3 APIs directly.

Related

How to play with spdlog?

I downloaded spdlog and followed example 1.
Then I moved on to example 2 (create stdout/stderr logger object) and got stuck. It runs as it is, but if I change
spdlog::get("console") to spdlog::get("err_logger") it crashes.
Am I supposed to change it like that?
#include "spdlog/spdlog.h"
#include "spdlog/sinks/stdout_color_sinks.h"
void stdout_example()
{
// create color multi threaded logger
auto console = spdlog::stdout_color_mt("console");
auto err_logger = spdlog::stderr_color_mt("stderr");
spdlog::get("err_logger")->info("loggers can be retrieved from a global registry using the spdlog::get(logger_name)");
}
int main()
{
stdout_example();
return 0;
}
I also tried Basic file logger example:
#include <iostream>
#include "spdlog/sinks/basic_file_sink.h"
void basic_logfile_example()
{
try
{
auto logger = spdlog::basic_logger_mt("basic_logger", "logs/basic-log.txt");
}
catch (const spdlog::spdlog_ex &ex)
{
std::cout << "Log init failed: " << ex.what() << std::endl;
}
}
int main()
{
basic_logfile_example();
return 0;
}
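I see that it creates the basic-log.txt file, but there is nothing in it.
It crashes because no logger is registered under the name err_logger; as far as I know there is no default err_logger. spdlog::get() looks a logger up by its registered name, not by the name of the variable that holds it, and it returns a null pointer for an unknown name, so calling ->info() on the result crashes. The basic file example leaves the file empty because it creates the logger but never logs a message through it.
You need code like this. It is fairly complex and you may not need all of it, though: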
#include <memory>
#include <vector>
#include "spdlog/spdlog.h"
#include "spdlog/async.h"
#include "spdlog/sinks/stdout_color_sinks.h"
#include "spdlog/sinks/rotating_file_sink.h"
void multi_sink_example2()
{
spdlog::init_thread_pool(8192, 1);
auto stdout_sink = std::make_shared<spdlog::sinks::stdout_color_sink_mt >();
auto rotating_sink = std::make_shared<spdlog::sinks::rotating_file_sink_mt>("mylog.txt", 1024*1024*10, 3);
std::vector<spdlog::sink_ptr> sinks {stdout_sink, rotating_sink};
auto logger = std::make_shared<spdlog::async_logger>("err_logger", sinks.begin(), sinks.end(), spdlog::thread_pool(), spdlog::async_overflow_policy::block);
spdlog::register_logger(logger); //<-- this line registers logger for spdlog::get
}
After this code has run, you can use spdlog::get("err_logger").
You can read about creating and registering loggers here.
I think spdlog::stderr_color_mt("stderr"); registers a logger under the name stderr, so spdlog::get("stderr") should work, but I have not tested it myself.
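The name-versus-variable point can be illustrated without spdlog at all. The following is a simplified model of a registry keyed by name, not spdlog's actual implementation; `ToyLogger`, `ToyRegistry`, `create` and `get` are hypothetical names. It shows why a lookup under a name that was never registered yields a null pointer, which has to be checked before use.

```cpp
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a logger; it only remembers its registered name.
struct ToyLogger {
    std::string name;
};

// Simplified model of a global logger registry keyed by name.
class ToyRegistry {
public:
    // Create a logger and register it under the given name.
    std::shared_ptr<ToyLogger> create(const std::string& name) {
        auto logger = std::make_shared<ToyLogger>(ToyLogger{name});
        loggers_[name] = logger;
        return logger;
    }

    // Look a logger up by its registered name; returns nullptr if unknown.
    std::shared_ptr<ToyLogger> get(const std::string& name) const {
        auto it = loggers_.find(name);
        if (it == loggers_.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::unordered_map<std::string, std::shared_ptr<ToyLogger>> loggers_;
};
```

In this model, a logger created under the name "stderr" and stored in a variable called err_logger can only be found again as "stderr"; asking for "err_logger" returns a null pointer.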

How do I write a minimal GraphQL query of the Axie server?

I'm writing a program to download data about Axies and process them. My plan is to download all the marketplace, getting just the index numbers, then download details about Axies. Before getting all the details about an Axie, I'd like to get just one detail. I've succeeded in making an HTTPS connection to the server and sending a query, but all it replies is "Bad Request".
I've been using Shane Maglangit's site https://axie-graphql.web.app/ for examples, but the examples are too big for me to understand, since I don't know GraphQL or JSON, and part of the queries has literal \n and the other part has linefeeds, which is confusing me. His code is in JavaScript, which I don't know, so I don't know if JS is doing something different with \n than C++ does.
Here's my code:
main.cpp
#include <iostream>
#include <iomanip>
#include <boost/program_options.hpp>
#include "http.h"
using namespace std;
int main(int argc,char **argv)
{
string query="{\n \"operationName\": \"GetAxieDetail\",\n"
" \"variables\":\n {\n \"axieId\": \"5144540\"\n },\n"
" \"query\": \"query GetAxieDetail($axieId: ID!) {\\n ...AxieDetail\\n __typename}\n}"
"fragment AxieDetail on Axie{axie(axieId: $axieId)}\"";
string response;
string urlv2="https://axieinfinity.com/graphql-server-v2/graphql";
string urlv1="https://graphql-gateway.axieinfinity.com/graphql";
response=httpPost(urlv1,query);
cout<<response<<endl;
return 0;
}
http.h
#include <string>
std::string httpPost(std::string url,std::string data);
http.cpp
#include <boost/asio.hpp>
#include <boost/asio/ssl.hpp>
#include <boost/beast.hpp>
#include <boost/beast/ssl.hpp>
#include <boost/asio/ssl/error.hpp>
#include <boost/asio/ssl/stream.hpp>
#include <array>
#include <iostream>
namespace beast=boost::beast;
namespace http=beast::http;
namespace net=boost::asio;
namespace ssl=net::ssl;
using tcp=net::ip::tcp;
using namespace std;
array<string,4> parseUrl(string url)
// protocol, hostname, port, path. All are strings, including the port.
{
size_t pos0=url.find("://");
size_t pos1;
array<string,4> ret;
ret[0]=url.substr(0,pos0);
if (pos0<url.length())
pos0+=3;
pos1=url.find("/",pos0);
ret[1]=url.substr(pos0,pos1-pos0);
ret[3]=url.substr(pos1);
pos0=ret[1].find(":");
if (pos0<ret[1].length())
{
ret[2]=ret[1].substr(pos0+1);
ret[1]=ret[1].substr(0,pos0);
}
else
if (ret[0]=="https")
ret[2]="443";
else if (ret[0]=="http")
ret[2]="80";
else
ret[2]="0";
return ret;
}
string httpPost(string url,string data)
{
net::io_context context;
ssl::context ctx(ssl::context::tlsv12_client);
tcp::resolver res(context);
tcp::resolver::results_type endpoints;
beast::ssl_stream<beast::tcp_stream> stream(context,ctx);
array<string,4> parsed=parseUrl(url);
http::request<http::string_body> req;
http::response<http::string_body> resp;
beast::flat_buffer buffer;
//load_root_certificates(ctx);
ctx.set_verify_mode(ssl::verify_peer);
endpoints=res.resolve(parsed[1],parsed[2]);
beast::get_lowest_layer(stream).connect(endpoints);
SSL_set_tlsext_host_name(stream.native_handle(),parsed[1].c_str());
if (parsed[0]=="https")
stream.handshake(net::ssl::stream_base::client);
req.method(http::verb::post);
req.target(parsed[3]);
req.set(http::field::host,parsed[1]);
req.set(http::field::user_agent,BOOST_BEAST_VERSION_STRING);
req.set(http::field::content_type,"application/json");
req.set(http::field::accept,"application/json");
req.body()=data;
req.prepare_payload();
http::write(stream,req);
http::read(stream,buffer,resp);
cout<<parsed[0]<<"|\n"<<parsed[1]<<"|\n"<<parsed[2]<<"|\n"<<parsed[3]<<"|\n";
cout<<data<<"|\n";
return resp.body();
}
How can I write a query that returns one detail of the Axie with the specified number? Which of the two Axie servers should I use, and what's the difference?
Here is a working query string:
string query="{\n"
" \"operationName\": \"GetAxieDetail\",\n"
" \"variables\":\n"
" {\n"
" \"axieId\": \"5144540\"\n"
" },\n"
" \"query\":\n"
" \"query GetAxieDetail($axieId: ID!)"
" {\\n"
" axie(axieId: $axieId)\\n"
" {\\n"
" class\\n"
" }\\n"
" }\"\n"
"}\n";
The response is:
{"data":{"axie":{"class":"Plant"}}}
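The server rejects raw line feeds inside the quoted query string (JSON does not permit unescaped control characters inside a string) but accepts the escaped sequence \n. The escaped \n makes no visible difference, though, as the response is a single line either way.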
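Escaping the query by hand is error-prone, so one option is to write the GraphQL text in a raw string literal and escape it programmatically. This is a sketch using only the standard library; `jsonEscape`, `buildGraphqlBody` and `exampleBody` are hypothetical helpers, and the escaping covers only quotes, backslashes and control characters.

```cpp
#include <cstdio>
#include <string>

// Escape a string so it can be embedded between double quotes in JSON.
std::string jsonEscape(const std::string& in) {
    std::string out;
    for (unsigned char c : in) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            case '\r': out += "\\r";  break;
            case '\t': out += "\\t";  break;
            default:
                if (c < 0x20) {  // any remaining control character
                    char buf[7];
                    std::snprintf(buf, sizeof buf, "\\u%04x", static_cast<unsigned>(c));
                    out += buf;
                } else {
                    out += static_cast<char>(c);
                }
        }
    }
    return out;
}

// Build the POST body for a GraphQL request with a single axieId variable.
std::string buildGraphqlBody(const std::string& operationName,
                             const std::string& query,
                             const std::string& axieId) {
    return "{\"operationName\":\"" + jsonEscape(operationName) +
           "\",\"variables\":{\"axieId\":\"" + jsonEscape(axieId) +
           "\"},\"query\":\"" + jsonEscape(query) + "\"}";
}

// The query from the answer above, written readably in a raw string literal.
std::string exampleBody() {
    const std::string query = R"(query GetAxieDetail($axieId: ID!) {
  axie(axieId: $axieId) {
    class
  }
})";
    return buildGraphqlBody("GetAxieDetail", query, "5144540");
}
```

The string returned by `exampleBody()` contains no raw line feeds, so it could be passed as the data argument of httpPost() in place of the hand-written literal.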

Upload an image file takes too long on AWS S3 c++ SDK

Uploading .jpg images with the AWS S3 C++ SDK as a certain IAM user introduces huge delays that do not seem to be caused by network traffic or latency issues. I am using the free tier of S3 and MSVC 2017 64-bit for my application (on a Windows 10 PC). Here is some sample code:
Aws::SDKOptions options;
Aws::InitAPI(options);
Aws::Client::ClientConfiguration config;
config.region = Aws::Region::US_EAST_2;
Aws::S3::S3Client s3_client(Aws::Auth::AWSCredentials(KEY,ACCESS_KEY), config);
const Aws::String bucket_name = BUCKET;
const Aws::String object_name = "image.jpg";
Aws::S3::Model::PutObjectRequest put_object_request;
put_object_request.SetBucket(bucket_name);
put_object_request.SetKey(object_name);
std::shared_ptr<Aws::IOStream> input_data =
Aws::MakeShared<Aws::FStream>("PutObjectInputStream",
"../image.jpg",
std::ios_base::in | std::ios::binary);
put_object_request.SetBody(input_data);
put_object_request.SetContentType("image/jpeg");
input_data->seekg(0LL, input_data->end);
put_object_request.SetContentLength(static_cast<long>(input_data->tellg()));
auto put_object_outcome = s3_client.PutObject(put_object_request);
When I upload images bigger than 100 KB, the total execution time of
PutObject(put_object_request);
grows sharply: it exceeds 2 min for a 520 KB image.
I have tried the same example using Python boto3, and the total upload time for the same image is around 25 s.
Has anyone faced the same issue?
After a closer look at the AWS GitHub repo I figured out the issue.
The problem was that WinHttpSyncHttpClient was timing out and resetting the upload internally, without exiting the upload thread, and finally aborting the transaction. Setting a custom timeout value solved the problem.
I used multipart upload to re-implement the example, as it seems more robust and manageable. Although I thought it was unavailable in the C++ SDK, that is not the case: the TransferManager does the same job for C++ (though not by using S3 headers as Java, .NET and PHP do).
Thanks to KaibaLopez and SCalwas from the AWS GitHub repo, who helped me solve the issues (issue1, issue2). I am pasting example code in case anyone faces the same issue:
#include "pch.h"
#include <iostream>
#include <fstream>
#include <filesystem>
#include <aws/core/Aws.h>
#include <aws/core/auth/AWSCredentials.h>
#include <aws/s3/S3Client.h>
#include <aws/s3/model/Bucket.h>
#include <aws/transfer/TransferManager.h>
#include <aws/transfer/TransferHandle.h>
static const char* KEY = "KEY";
static const char* BUCKET = "BUCKET_NAME";
static const char* ACCESS_KEY = "AKEY";
static const char* OBJ_NAME = "img.jpg";
static const char* const ALLOCATION_TAG = "S3_SINGLE_OBJ_TEST";
int main()
{
Aws::SDKOptions options;
Aws::InitAPI(options);
{
Aws::Client::ClientConfiguration config;
config.region = Aws::Region::US_EAST_2;
config.requestTimeoutMs = 20000;
auto s3_client = std::make_shared<Aws::S3::S3Client>(Aws::Auth::AWSCredentials(KEY, ACCESS_KEY), config);
const Aws::String bucket_name = BUCKET;
const Aws::String object_name = OBJ_NAME;
const Aws::String key_name = OBJ_NAME;
auto s3_client_executor = Aws::MakeShared<Aws::Utils::Threading::DefaultExecutor>(ALLOCATION_TAG);
Aws::Transfer::TransferManagerConfiguration trConfig(s3_client_executor.get());
trConfig.s3Client = s3_client;
trConfig.uploadProgressCallback =
[](const Aws::Transfer::TransferManager*, const std::shared_ptr<const Aws::Transfer::TransferHandle>&transferHandle)
{ std::cout << "Upload Progress: " << transferHandle->GetBytesTransferred() <<
" of " << transferHandle->GetBytesTotalSize() << " bytes" << std::endl;};
std::cout << "File start upload" << std::endl;
auto transfer_manager = Aws::Transfer::TransferManager::Create(trConfig);
auto transferHandle = transfer_manager->UploadFile(object_name.c_str(),
bucket_name.c_str(), key_name.c_str(), "multipart/form-data", Aws::Map<Aws::String, Aws::String>());
transferHandle->WaitUntilFinished();
if(transferHandle->GetStatus() == Aws::Transfer::TransferStatus::COMPLETED)
std::cout << "File up" << std::endl;
else
std::cout << "Error uploading: " << transferHandle->GetLastError() << std::endl;
}
Aws::ShutdownAPI(options);
return 0;
}

I invoke LSCopyApplicationURLsForURL() using C++, but get a segmentation fault

I want to write a C++ program that finds the applications suitable for opening a specified file. I found the LSCopyApplicationURLsForURL API and created a command-line C++ application in Xcode.
But when I run the program, I always get a segmentation fault. Xcode shows an EXC_BAD_ACCESS(code=1, address....) error.
I also tried running it with sudo, with the same result. What is the problem?
The code:
#include <iostream>
#include <objc/objc.h>
#include <objc/objc-runtime.h>
#include <CoreFoundation/CoreFoundation.h>
#include <CoreServices/CoreServices.h>
using namespace std;
int main(int argc, const char * argv[]) {
auto url = CFURLRef("file:///Users/efan/src/a.cpp");
auto ret = LSCopyApplicationURLsForURL(url, kLSRolesAll);
cout << ret << endl;
return 0;
}
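CFURLRef("file:///...") only casts a C string pointer to CFURLRef; it does not create a CFURL object, so the API ends up reading invalid memory. Try creating your CFURLRef using one of the proper CFURLCreate* functions. See "Creating a CFURL" here.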
For example:
auto tempStringURL = CFStringCreateWithCString(nullptr, "/Users/efan/src/a.cpp", kCFStringEncodingUTF8);
auto url = CFURLCreateWithFileSystemPath(nullptr, tempStringURL, kCFURLPOSIXPathStyle, FALSE);
auto ret = LSCopyApplicationURLsForURL(url, kLSRolesAll);
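You need to release the objects returned by the Create and Copy functions (tempStringURL, url and ret) with CFRelease once you are done with them. CFRelease must not be called on NULL, so check ret first.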

c++ Debug Assertion Failed on HTTP Request

I'm writing code that needs to make a GET request and manipulate the data received. For the request I'm using the C++ REST SDK (codename "Casablanca").
This is my code:
#include <cpprest/http_client.h>
#include <cpprest/filestream.h>
using namespace utility;
using namespace web;
using namespace web::http;
using namespace web::http::client;
using namespace concurrency::streams;
//This method i saw on the Microsoft documentation
pplx::task<void> HTTPStreamingAsync()
{
http_client client(L"http://localhost:10000/Something"); //The api is running at the moment
// Make the request and asynchronously process the response.
return client.request(methods::GET).then([](http_response response)
{
// Print the status code.
std::wostringstream ss;
ss << L"Server returned status code " << response.status_code() << L'.' << std::endl;
std::wcout << ss.str();
// TODO: Perform actions here reading from the response stream.
auto bodyStream = response.body();
// In this example, we print the length of the response to the console.
ss.str(std::wstring());
ss << L"Content length is " << response.headers().content_length() << L" bytes." << std::endl;
std::wcout << ss.str();
});
}
int main(int argc, char **argv)
{
HTTPStreamingAsync().wait();
//...
}
When I run it in the debugger, I get an error on the following line:
return client.request(methods::GET).then([](http_response response)
In the debugger I can see that the variable "client" has content, but I still get this error:
Image with the Error Message
I googled the error, and most people say that it is an error in the code (trying to access invalid memory)...
Any ideas?
This issue can happen when the cpprestsdk DLL is built with Multi-threaded DLL (/MD) and the calling code is built with Multi-threaded (/MT). Since cpprestsdk does not offer a configuration for a static .lib file, you are forced to use /MD. At least that is true to the best of my knowledge, as I haven't been able to compile cpprestsdk.lib out of the box without a bunch of linker errors.