gRPC and etcd client - C++

This question involves etcd-specific details, but I think it relates more to working with gRPC in general.
I'm trying to create an etcd Watch for some keys. Since the documentation is sparse, I had a look at the Nokia implementation.
It was easy to adapt the code to my needs, and I came up with a first version that worked just fine, creating a WatchCreateRequest and firing the callback on key update. So far so good. Then I tried to add more than one key to watch. Fiasco! ClientAsyncReaderWriter fails to Read/Write in that case. Now to the question.
If I have the following members in my class
Watch::Stub watchStub;
CompletionQueue completionQueue;
ClientContext context;
std::unique_ptr<ClientAsyncReaderWriter<WatchRequest, WatchResponse>> stream;
WatchResponse reply;
and I want to support multiple Watches added to my class, I guess I have to hold several of these variables per Watch rather than as class members.
First of all, I guess the WatchResponse reply should be one per Watch. I'm less sure about the stream: should I hold one per Watch? I'm almost sure the context could be reused for all Watches, and 100% sure the stub and completionQueue can be reused for all Watches.
So the question is: is my guesswork right? What about thread safety? I didn't find any documentation describing which objects are safe to use from multiple threads and where I have to synchronize access.
Any link to documentation (not this one) will be appreciated!
Test code, before splitting the members into per-Watch properties (no proper shutdown, I know):
using namespace grpc;

class Watcher
{
public:
    using Callback = std::function<void(const std::string&, const std::string&)>;

    Watcher(std::shared_ptr<Channel> channel) : watchStub(channel)
    {
        stream = watchStub.AsyncWatch(&context, &completionQueue, (void*) "create");
        eventPoller = std::thread([this]() { WaitForEvent(); });
    }

    void AddWatch(const std::string& key, Callback callback)
    {
        AddWatch(key, callback, false);
    }

    void AddWatches(const std::string& key, Callback callback)
    {
        AddWatch(key, callback, true);
    }

private:
    void AddWatch(const std::string& key, Callback callback, bool isRecursive)
    {
        auto insertionResult = callbacks.emplace(key, callback);
        if (!insertionResult.second) {
            throw std::runtime_error("Event handle already exist.");
        }
        WatchRequest watch_req;
        WatchCreateRequest watch_create_req;
        watch_create_req.set_key(key);
        if (isRecursive) {
            watch_create_req.set_range_end(key + "\xFF");
        }
        watch_req.mutable_create_request()->CopyFrom(watch_create_req);
        stream->Write(watch_req, (void*) insertionResult.first->first.c_str());
        stream->Read(&reply, (void*) insertionResult.first->first.c_str());
    }

    void WaitForEvent()
    {
        void* got_tag;
        bool ok = false;
        while (completionQueue.Next(&got_tag, &ok)) {
            if (ok == false) {
                break;
            }
            if (got_tag == (void*) "writes done") {
                // Signal shutdown
            }
            else if (got_tag == (void*) "create") {
            }
            else if (got_tag == (void*) "write") {
            }
            else {
                auto tag = std::string(reinterpret_cast<char*>(got_tag));
                auto findIt = callbacks.find(tag);
                if (findIt == callbacks.end()) {
                    throw std::runtime_error("Key \"" + tag + "\"not found");
                }
                if (reply.events_size()) {
                    ParseResponse(findIt->second);
                }
                stream->Read(&reply, got_tag);
            }
        }
    }

    void ParseResponse(Callback& callback)
    {
        for (int i = 0; i < reply.events_size(); ++i) {
            auto event = reply.events(i);
            auto key = event.kv().key();
            callback(event.kv().key(), event.kv().value());
        }
    }

    Watch::Stub watchStub;
    CompletionQueue completionQueue;
    ClientContext context;
    std::unique_ptr<ClientAsyncReaderWriter<WatchRequest, WatchResponse>> stream;
    WatchResponse reply;
    std::unordered_map<std::string, Callback> callbacks;
    std::thread eventPoller;
};

I'm sorry that I'm not very sure about the proper Watch design here. It's not clear to me whether you want to create a separate gRPC call for each Watch.
In any case, each gRPC call will have its own ClientContext and ClientAsyncReaderWriter, but the stub and the CompletionQueue are not per-call objects.
As far as I know, there is no central place to find out which classes are thread-safe. You may want to read the API documentation to set the right expectations.
When I was writing the async server load reporting service, the only place I added synchronization myself was around the CompletionQueue, so that I don't enqueue new tags to the cq after it has been shut down.
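For illustration only, here is a rough, untested sketch of the layout described above if you go with one Watch call per key: the stub and the CompletionQueue stay shared, while the ClientContext, the reader/writer and the reply move into a per-call struct. It reuses the type names and includes from the question's code; WatchCall, MultiWatcher and StartWatch are made-up names.

struct WatchCall
{
    ClientContext context;   // one per call
    WatchResponse reply;     // one per call
    std::unique_ptr<ClientAsyncReaderWriter<WatchRequest, WatchResponse>> stream;
};

class MultiWatcher
{
public:
    explicit MultiWatcher(std::shared_ptr<Channel> channel) : watchStub(channel) {}

    // Starts a separate Watch RPC. The WatchCall pointer itself is used as the
    // completion queue tag, so completions can be routed back to the right call
    // without comparing string literal addresses.
    WatchCall* StartWatch()
    {
        auto call = std::make_unique<WatchCall>();
        call->stream = watchStub.AsyncWatch(&call->context, &completionQueue, call.get());
        calls.push_back(std::move(call));
        return calls.back().get();
    }

private:
    Watch::Stub watchStub;            // shared by all calls
    CompletionQueue completionQueue;  // shared by all calls, drained by one poller thread
    std::vector<std::unique_ptr<WatchCall>> calls;
};

Once the "create" completion arrives for a given WatchCall, you would issue the Write() for the WatchCreateRequest and keep exactly one outstanding Read() per stream, again tagging with the WatchCall pointer.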

Related

How to properly manage messages sent to a thread in C++

In my Android app, I use C++ to do some work. In my C++ code, I use a thread to do some tasks. Using this example and this example, here is how I proceed (I simplified the actual code to keep it easy to read):
std::thread* threadLocal;
std::queue<ThreadMessage*> queueLocale;
std::mutex mutexLocal;
std::condition_variable cvLocal;

struct ThreadMessage
{
    ThreadMessage(int i)
    {
        id = i;
    }
    int id;
};

void MyWorkerThread::createThread()
{
    if (!threadLocal)
    {
        threadLocal = new std::thread(&MyWorkerThread::process, this);
    }
}

void MyWorkerThread::sendTask1()
{
    if (threadLocal)
    {
        // message:
        ThreadMessage* threadMessage = new ThreadMessage(MSG_TASK_1);
        // send the message:
        std::unique_lock<std::mutex> lock(mutexLocal);
        queueLocale.push(std::move(threadMessage));
        cvLocal.notify_one();
    }
}

void MyWorkerThread::sendTask2()
{
    if (threadLocal)
    {
        // message:
        ThreadMessage* threadMessage = new ThreadMessage(MSG_TASK_2);
        // send the message:
        std::unique_lock<std::mutex> lock(mutexLocal);
        queueLocale.push(std::move(threadMessage));
        cvLocal.notify_one();
    }
}

void MyWorkerThread::process()
{
    while (1)
    {
        // init :
        ThreadMessage* threadMessage = 0;
        // waiting for messages :
        {
            std::unique_lock<std::mutex> lock(mutexLocal);
            while (queueLocale.empty())
            {
                cvLocal.wait(lock);
            }
            threadMessage = std::move(queueLocale.front());
            queueLocale.pop();
        }
        // tasks :
        switch (threadMessage->id)
        {
            case MSG_TASK_1:
            {
                doSomeWork1();
                delete threadMessage;
                break;
            }
            case MSG_TASK_2:
            {
                doSomeWork2();
                delete threadMessage;
                break;
            }
            default:
            {
                delete threadMessage;
                break;
            }
        }
    }
}
It works well in most cases, but sometimes my app crashes when delete threadMessage is called, and I don't understand why (since I don't see how it could be called twice on the same object).
Here are the reasons why I need to send messages to a thread instead of just creating a new thread each time I want to run doSomeWork1() or doSomeWork2():
doSomeWork1() and doSomeWork2() have to be executed in the same thread
One of those functions is called very frequently (approx. 200 times per second), so I don't want to create a thread each time
So my question is: what is the proper way to send a message to a thread, and to handle it inside the thread, so as to avoid the error on delete?
Thanks for your help.
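For reference, here is a minimal sketch (assuming C++14; Worker, send and MSG_QUIT are made-up names, MSG_TASK_1/MSG_TASK_2 mirror the question) of the usual way to make the ownership explicit: keep std::unique_ptr<ThreadMessage> in the queue, so every message is destroyed exactly once by whichever thread ends up holding it, and no manual delete is needed.

#include <condition_variable>
#include <memory>
#include <mutex>
#include <queue>
#include <thread>

enum { MSG_QUIT = 0, MSG_TASK_1 = 1, MSG_TASK_2 = 2 };

struct ThreadMessage
{
    explicit ThreadMessage(int i) : id(i) {}
    int id;
};

class Worker
{
public:
    Worker() : thread_(&Worker::process, this) {}
    ~Worker()
    {
        send(MSG_QUIT);
        thread_.join();
    }

    void send(int id)
    {
        std::lock_guard<std::mutex> lock(mutex_);
        queue_.push(std::make_unique<ThreadMessage>(id));
        cv_.notify_one();
    }

private:
    void process()
    {
        while (true)
        {
            std::unique_ptr<ThreadMessage> msg;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                cv_.wait(lock, [this] { return !queue_.empty(); });
                msg = std::move(queue_.front());
                queue_.pop();
            }
            switch (msg->id)
            {
                case MSG_TASK_1: /* doSomeWork1(); */ break;
                case MSG_TASK_2: /* doSomeWork2(); */ break;
                default: return; // MSG_QUIT ends the thread
            }
            // msg goes out of scope here and is freed exactly once
        }
    }

    std::queue<std::unique_ptr<ThreadMessage>> queue_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::thread thread_;
};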

protected data structure elements passed by value across functions

I have a network application where multiple users have their userContext saved on the server. Users continuously send different requests to the server, and the server processes them asynchronously using a thread pool and an event loop (epoll, for example).
Each user has their own ID, unique with respect to the server. The server stores the user contexts in a map<int,userContext>. When the server receives a message from a user, it decodes the message and extracts the request, the userID, and so on. Then the server processes the request and also updates the stored userContext in the map as required (for example, state machine updates of the current state in the userContext).
In my application there are a lot of such procedure calls (a few are nested as well), and I am passing the user context by value. I cannot pass references to the map values (then the protection no longer exists).
Below is a sample implementation of the userContextStore and two procedures.
#include <iostream>
#include <map>
#include <mutex>
using namespace std;

class userContext {
public:
    int id;
    int value1;
    int value2;
    userContext() : id(-1), value1(-1), value2(-1) {}
    userContext(const userContext &context) {
        this->id = context.id;
        this->value1 = context.value1;
        this->value2 = context.value2;
    }
};

class contextStore {
public:
    map<int,userContext> Map;
    std::mutex m;
    void update(userContext context, int id) {
        lock_guard<std::mutex> lock(m);
        if (Map.find(id) != Map.end()) {
            Map[id] = context;
            return;
        }
        Map[id] = context;
    }
    userContext getUserContext(int id) {
        lock_guard<std::mutex> lock(m);
        userContext context(Map[id]);
        return context;
    }
    void printContext(int id) {
        lock_guard<std::mutex> lock(m);
        if (Map.find(id) != Map.end()) {
            userContext temp(Map[id]);
            cout << temp.value1 << endl;
            cout << temp.value2 << endl;
        }
    }
};

void procedureA(contextStore &store, userContext context) {
    // do some long processing using the provided context
    // change the context copy in between
    // based on the above processing
    context.value1 += 20; // example of a change
    int id = context.id;
    store.update(context, id);
}

void procedureB(contextStore &store, int id) {
    userContext context(store.getUserContext(id));
    // do some other long processing
    // change the context copy in between
    context.value1 -= 10; // example of a change
    store.update(context, id);
}
Is there a better way (one that possibly avoids copying the object multiple times) to pass the userContext objects for the required modification inside a particular procedure call?
The second issue is that I am using one fat lock protecting the entire map. While one user's request is being processed by the server, other users' requests cannot be processed (because the lock protects the entire map). Is there a way to use finer-grained locking in this situation?
Thanks!
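For reference, a sketch (not from the post; Entry and withUser are made-up names) of one common way to address both points at once: keep each user's context behind a shared_ptr with its own mutex. The map-wide lock is then held only long enough to find or create the entry, and the procedure modifies the context in place under the per-user lock instead of copying it out and back.

#include <functional>
#include <map>
#include <memory>
#include <mutex>

struct UserContext
{
    int id = -1;
    int value1 = -1;
    int value2 = -1;
};

class ContextStore
{
public:
    // Runs fn with exclusive access to one user's context.
    void withUser(int id, const std::function<void(UserContext&)>& fn)
    {
        std::shared_ptr<Entry> entry;
        {
            std::lock_guard<std::mutex> mapLock(mapMutex_); // short, map-wide
            auto& slot = map_[id];
            if (!slot)
                slot = std::make_shared<Entry>();
            entry = slot;
        }
        std::lock_guard<std::mutex> userLock(entry->m);     // per-user
        fn(entry->context); // modified in place, no copies
    }

private:
    struct Entry
    {
        std::mutex m;
        UserContext context;
    };
    std::map<int, std::shared_ptr<Entry>> map_;
    std::mutex mapMutex_;
};

// Roughly corresponding to procedureA in the question:
// store.withUser(id, [](UserContext& ctx) { ctx.value1 += 20; });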

How to determine the type_name of a deleted DataWriter using RTI DDS

I'm writing a tool in C++ using RTI DDS 5.2 that needs to detect when DataWriters are deleted and to know the type_name of the related data. I'm using code similar to this and this.
I'm using a DDSWaitSet, and it does get triggered when a DataWriter is deleted with delete_datawriter, but the SampleInfo indicates that the data is not valid and, sure enough, the data sample's type_name is empty.
Is there a way to delete a DataWriter in such a way as to cause the built-in topic subscription to get the type_name? Or is there a QoS setting I can use to fix this behavior?
Found a workaround for this problem. I still don't know how to make the data sample valid when a DataWriter goes away, but the SampleInfo does have the field instance_handle, which uniquely identifies the writer. The solution is to keep track of type_names while the data is valid and just look them up when it isn't. Here is the gist of the code I am using to solve the issue:
struct cmp_instance_handle {
    bool operator()(const DDS_InstanceHandle_t& a, const DDS_InstanceHandle_t& b) const {
        return !DDS_InstanceHandle_equals(&a, &b);
    }
};

void wait_for_data_writer_samples()
{
    ConditionSeq cs;
    DDSWaitSet* ws = new DDSWaitSet();
    StatusCondition* condition = _publication_dr->get_statuscondition();
    DDS_Duration_t timeout = {DDS_DURATION_INFINITE_SEC, DDS_DURATION_INFINITE_NSEC};
    std::map<DDS_InstanceHandle_t, std::string, cmp_instance_handle> instance_handle_map;
    ws->attach_condition(condition);
    condition->set_enabled_statuses(DDS_STATUS_MASK_ALL);
    while(true) {
        ws->wait(cs, timeout);
        PublicationBuiltinTopicDataSeq data_seq;
        SampleInfoSeq info_seq;
        _publication_dr->take(
            data_seq,
            info_seq,
            DDS_LENGTH_UNLIMITED,
            ANY_SAMPLE_STATE,
            ANY_VIEW_STATE,
            ANY_INSTANCE_STATE
        );
        int len = data_seq.length();
        for(int i = 0; i < len; ++i) {
            DDS_InstanceHandle_t instance_handle = info_seq[i].instance_handle;
            if(info_seq[i].valid_data) {
                std::string type_name(data_seq[i].type_name);
                // store the type_name in the map for future use
                instance_handle_map[instance_handle] = type_name;
                if(info_seq[i].instance_state == DDS_InstanceStateKind::DDS_ALIVE_INSTANCE_STATE) {
                    do_data_writer_alive_callback(type_name);
                }
                else {
                    // If the data is valid, but not DDS_ALIVE_INSTANCE_STATE, the DataWriter died *and* we can
                    // directly access the type_name, so we can handle that case here
                    do_data_writer_dead_callback(type_name);
                }
            }
            else {
                // If the data is not valid then the DataWriter is definitely not alive, but we can't directly access
                // the type_name. Fortunately we can look it up in our map.
                do_data_writer_dead_callback(instance_handle_map[instance_handle]);
                // at this point the DataWriter is gone, so we remove it from the map.
                instance_handle_map.erase(instance_handle);
            }
        }
        _publication_dr->return_loan(data_seq, info_seq);
    }
}

RxCpp: how to control subject observer's lifetime when used with buffer_with_time

The purpose of the following code is to let various classes publish data to an observable. Some classes will observe every item, and some will observe periodically with buffer_with_time().
This works well until the program exits; then it crashes, probably because the observer using buffer_with_time() is still hanging on to some thread.
struct Data
{
    Data() : _subscriber(_subject.get_subscriber()) { }
    ~Data() { _subscriber.on_completed(); }
    void publish(std::string data) { _subscriber.on_next(data); }
    rxcpp::observable<std::string> observable() { return _subject.get_observable(); }
private:
    rxcpp::subjects::subject<std::string> _subject;
    rxcpp::subscriber<std::string> _subscriber;
};

void foo()
{
    Data data;
    auto period = std::chrono::milliseconds(30);
    auto s1 = data.observable()
        .buffer_with_time(period, rxcpp::observe_on_new_thread())
        .subscribe([](std::vector<std::string>& data)
                   { std::cout << data.size() << std::endl; });
    data.publish("test 1");
    data.publish("test 2");
    std::this_thread::sleep_for(std::chrono::milliseconds(100));
    // hope to call something here so s1's thread can be joined.
    // program crashes upon exit
}
I tried calling s1.unsubscribe(), and various combinations of as_blocking(), from(), and merge(), but still can't get the program to exit gracefully.
Note that I used subjects here because publish can then be called from different places (which can be different threads). I am not sure this is the best mechanism for that; I am open to other ways to accomplish it.
Advice?
This is very close to working.
However, having the Data destructor complete the input, while also wanting the subscription to block the exit of foo until the input has completed, makes this more complex.
Here is a way to ensure that foo blocks until after Data destructs, using the existing Data contract.
void foo1()
{
    rxcpp::observable<std::vector<std::string>> buffered;
    {
        Data data;
        auto period = std::chrono::milliseconds(30);
        buffered = data.observable()
            .buffer_with_time(period, rxcpp::observe_on_new_thread())
            .publish().ref_count();
        buffered
            .subscribe([](const std::vector<std::string>& data)
                       { printf("%lu\n", data.size()); },
                       [](){ printf("data complete\n"); });
        data.publish("test 1");
        data.publish("test 2");
        // hope to call something here so s1's thread can be joined.
        // program crashes upon exit
    }
    buffered.as_blocking().subscribe();
    printf("exit foo1\n");
}
Alternatively, changing the shape of Data (adding a complete method) would allow the following code:
struct Data
{
    Data() : _subscriber(_subject.get_subscriber()) { }
    ~Data() { complete(); }
    void publish(std::string data) { _subscriber.on_next(data); }
    void complete() { _subscriber.on_completed(); }
    rxcpp::observable<std::string> observable() { return _subject.get_observable(); }
private:
    rxcpp::subjects::subject<std::string> _subject;
    rxcpp::subscriber<std::string> _subscriber;
};

void foo2()
{
    printf("foo2\n");
    Data data;
    auto newthread = rxcpp::observe_on_new_thread();
    auto period = std::chrono::milliseconds(30);
    auto buffered = data.observable()
        .buffer_with_time(period, newthread)
        .tap([](const std::vector<std::string>& data)
             { printf("%lu\n", data.size()); },
             [](){ printf("data complete\n"); });
    auto emitter = rxcpp::sources::timer(std::chrono::milliseconds(0), newthread)
        .tap([&](long) {
            data.publish("test 1");
            data.publish("test 2");
            data.complete();
        });
    // hope to call something here so s1's thread can be joined.
    // program crashes upon exit
    buffered.combine_latest(newthread, emitter).as_blocking().subscribe();
    printf("exit foo2\n");
}
I think this better expresses the dependencies.

WaitOne() timeout parameter lost?

Why does System.Threading.WaitHandle.WaitOne() have no overload with a timeout parameter, as available in the standard .NET implementation: http://msdn.microsoft.com/en-us/library/cc189907(v=vs.110).aspx
It's very useful in worker threads when, during a thread sleep, the thread is requested to stop from the main UI thread. Are there other ways to implement this?
Example:
public void StartBatteryAnimation()
{
    whStopThread = new ManualResetEvent(false);
    batteryAnimationThread = new Thread(new ThreadStart(BatteryAnimation_Callback));
    batteryAnimationThread.Start();
}

public void StopBatteryAnimation()
{
    whStopThread.Set();
    batteryAnimationThread.Join();
    batteryAnimationThread = null;
    whStopThread.Dispose();
    whStopThread = null;
}

public void BatteryAnimation_Callback()
{
    bool exitResult = false;
    while (true)
    {
        // Do some stuff
        exitResult = whStopThread.WaitOne(WAIT_INTERVALL);
        if (exitResult) break;
    }
}
Thanks Frank for your (1000th!!) reply.
So my custom implementation of WaitHandle.WaitOne(int Timeout) has become:
private Thread batteryAnimationThread = null;
private Semaphore batteryAnimationSemaphore = null;

public void StartBatteryAnimation()
{
    batteryAnimationSemaphore = new Semaphore(1);
    batteryAnimationSemaphore.Acquire();
    batteryAnimationThread = new Thread(new ThreadStart(BatteryAnimation_Callback));
    batteryAnimationThread.Start();
}

public void StopBatteryAnimation()
{
    batteryAnimationSemaphore.Release();
    batteryAnimationThread.Join();
    batteryAnimationThread = null;
    batteryAnimationSemaphore = null;
}

public void BatteryAnimation_Callback()
{
    bool stopThread = false;
    try
    {
        while (true)
        {
            // Do some stuff..
            stopThread = batteryAnimationSemaphore.TryAcquire(1, BATTERY_ANIMATION_INTERVALL, Java.Util.Concurrent.TimeUnit.MILLISECONDS);
            if (stopThread) break;
        }
    }
    catch (Exception ex)
    {
    }
    batteryAnimationSemaphore.Release();
}
Is this the right way to do it?
Thanks
This one hasn't been implemented yet. You may use semaphore.tryAcquire instead.
Background: Due to its design, dot42 supports the entire Android API (the C# classes are proxies generated from android.jar), but it supports only part of the .NET API, because the .NET classes are handcrafted on top of the Android/Java API.
Related question: Java Equivalent of .NET's ManualResetEvent and WaitHandle
UPDATE
We released the API under Apache License 2.0 so anyone may contribute now. I also logged an issue.