gRPC: How to set/reset deadline timer on successful write by ClientWriter - c++

I am using a gRPC bidirectional streaming connection to upload images to a remote server. I split the given image into smaller chunks of 1 MB and then perform the upload. Uploading works fine for images of around 4 MB, but if the size is > 10 MB the given deadline expires and the upload always fails. So far I have only tested images < 4 MB and > 10 MB in size.
Here is my upload function:
bool FileUploadClient::upload_file(const std::string& upload_file_name)
{
    bool upload_result = false;
    if(m_grpc_fileupload_stub)
    {
        ClientContext m_context;
        // set deadline timer
        std::chrono::system_clock::time_point deadline = std::chrono::system_clock::now() + std::chrono::seconds(10);
        m_context.set_deadline(deadline);

        FileUploadGrpc::FileUploadResponse file_upload_res;
        std::vector<FileUploadGrpc::FileUploadRequest> upload_request_vect;
        // split file contents into chunks as file is big in size
        read_file_into_chunks(upload_file_name, upload_request_vect);

        std::unique_ptr<::grpc::ClientWriter<::FileUploadGrpc::FileUploadRequest>> writer(
            m_grpc_fileupload_stub->UploadFileToServer(&m_context, &file_upload_res));

        // Send file contents chunk by chunk to grpc end
        for(const auto& upload_req_msg : upload_request_vect)
        {
            if(!writer->Write(upload_req_msg))
            {
                break; // if write fails then break
            }
            else
            {
                // I want to reset deadline timer
            }
        }
        writer->WritesDone();
        Status status = writer->Finish();

        upload_result = (status.ok() && (file_upload_res.status_code() == FileUploadGrpc::FileUploadStatusCode::OK));
        if(!upload_result)
        {
            LOG_ERROR(log_prefix << ", grpc file upload failed for: " << upload_file_name);
        }
    }
    return upload_result;
}
As I understand it, with the deadline above the RPC completes successfully for images < 4 MB, but for a 10 MB image I would have to increase the deadline to suit 10 MB, and that is not a good idea.
So, regardless of image size, how can I set/reset the deadline timer after every successful write operation? Or is there a better way to do this?
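A note on what is possible here: a gRPC deadline applies to the whole RPC and cannot be extended or reset once the call has started, so a per-write reset is not available on a single stream. A minimal sketch of one common workaround, sizing the deadline from the payload before creating the writer (kBaseBudget and kPerChunkBudget are assumed tuning constants; upload_request_vect is the chunk vector from the question):

// Sketch only: the deadline covers the entire RPC and cannot be changed
// after the call starts, so scale it with the number of 1 MB chunks up front.
constexpr auto kBaseBudget     = std::chrono::seconds(5);  // assumed fixed overhead
constexpr auto kPerChunkBudget = std::chrono::seconds(2);  // assumed budget per chunk

ClientContext m_context;
m_context.set_deadline(std::chrono::system_clock::now() +
                       kBaseBudget + kPerChunkBudget * upload_request_vect.size());

If even a scaled deadline is undesirable, splitting the upload into one RPC per chunk (or per few chunks) gives each write its own fixed deadline, at the cost of extra round trips.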

Related

strange logging timestamp with chrono::sleep_until

I am testing an application's latency during UDP communication on Windows 10.
I try to send a message every 1 second and receive a response sent immediately from the remote.
Send thread
It runs once every second.
auto start = std::chrono::system_clock::now();
unsigned int count = 1;
while (destroyFlag.load(std::memory_order_acquire) == false)
{
    if (isReady() == false)
    {
        break;
    }
    /*to do*/
    worker_();
    std::this_thread::sleep_until(start + std::chrono::milliseconds(interval_) * count++);
}
worker_()
The send thread calls this; it just sends a message and builds a log string.
socket_.send(address_);
logger_.log("," + std::string("Send") + "\n");
Receiver
When a message arrives, the receiver creates a receive log string and flushes it to a file.
auto& queueData = socket_.getQueue();
while (queueData.size() > 0)
{
    auto str = queueData.dequeue();
    logger_.log(",Receive" + str + "\n");
    logger_.flush();
}
I've been testing it overnight and I can't figure out why I got this result.
[Chart of the measured latency; x-axis: Hour_Minute_Second, y-axis: microseconds]
For a few hours it seemed to work as expected, but after that the logged times gradually drifted, as if they had moved to a different time zone.
Does anyone know why this is happening?
Switching to std::chrono::steady_clock fixed it and made my charts straight: unlike system_clock, steady_clock is monotonic, so it is not affected when Windows adjusts the system time (for example during NTP synchronization).
Another way is to turn off Windows' automatic time synchronization.
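A minimal sketch of the fix, reusing the names from the question and swapping only the clock that drives the schedule:

// Drive sleep_until from the monotonic clock so NTP adjustments to the
// system time cannot skew the send schedule.
auto start = std::chrono::steady_clock::now();
unsigned int count = 1;
while (destroyFlag.load(std::memory_order_acquire) == false)
{
    if (isReady() == false)
    {
        break;
    }
    worker_();
    std::this_thread::sleep_until(start + std::chrono::milliseconds(interval_) * count++);
}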

Problem with esp8266 sending large JSON Document via MQTT

I developed a little application that reads data from a sensor, stores it in the SPIFFS memory of my Wemos D1 mini (esp8266), then builds a JSON document and sends it via MQTT to my topic. The problem is that as long as I send a JSON doc with 10 objects everything works great, but when I increase the size of the doc beyond 10 objects nothing works. Eventually I need to send a JSON doc with 100 objects inside.
What have I already done?
- I'm using PubSubClient and I have already set MAX_PACKET_SIZE to the correct value.
- Using the ArduinoJson Assistant I found out the size of my JSON document (8192 bytes).
- I tried mqtt.fx to test whether the problem was the esp8266 or the MQTT broker: with mqtt.fx I am able to send a JSON doc with 100 objects.
As soon as I increase the size of the JSON doc I get a wdt error in the serial monitor of my Arduino IDE. I searched the internet for the wdt error but I don't get what it is or how to solve my problem.
Lastly, I tried printing the file.txt in SPIFFS (where I store the data) to the serial monitor, and I can store and then read back all 100 objects.
So in the end I think it's an esp8266 problem and not PubSubClient or MQTT. Am I right?
Has anyone here encountered this problem before, or is there some other test I can run?
I searched the internet for the wdt error but I don't get what it is or how to solve my problem
WDT stands for Watch Dog Timer. https://os.mbed.com/cookbook/WatchDog-Timer#:~:text=A%20watchdog%20timer%20(WDT)%20is,a%20software%20or%20hardware%20fault.
A watchdog timer (WDT) is a hardware timer that automatically generates a system reset if the main program neglects to periodically service it. It is often used to automatically reset an embedded device that hangs because of a software or hardware fault. Some systems may also refer to it as a computer operating properly (COP) timer. Many microcontrollers, including the mbed processor, have watchdog timer hardware.
Let's paint a better picture with an example. Say you set up a WDT with a time of 10 seconds. The WDT then starts counting down from 10 seconds, and if it reaches 0 the processor resets. "Feeding" the WDT resets the countdown to its original value, in this case 10 seconds. So if the WDT has counted down to 4 seconds remaining and you feed it, it goes back to 10 and starts counting down again.
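On the esp8266 Arduino core specifically, "feeding" looks like this; a minimal sketch (ESP.wdtFeed() and yield() are real esp8266 core calls, doSomeWork() is a hypothetical unit of work):

// Sketch for the esp8266 Arduino core: feed the software watchdog during a
// long-running operation so it never reaches zero and resets the chip.
void longOperation() {
    for (int i = 0; i < 1000; i++) {
        doSomeWork();   // hypothetical unit of work
        ESP.wdtFeed();  // reset the software WDT countdown
        yield();        // also lets WiFi/background tasks run
    }
}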
Has anyone here encountered this problem before, or is there some other test I can run?
It looks to me like sending a larger JSON object takes longer than what the WDT is set for. One possibility would be to break the JSON object up into multiple pieces and send it in smaller chunks instead of one large one, so that the time between WDT "feedings" is reduced. I have no idea if this is possible for you to change, but it should at least give you a better idea of what's happening.
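Illustrative only: chunking could look roughly like this with PubSubClient, assuming the payload is already serialized into a buffer; the topic and chunk size are placeholders, not from the original post:

// Sketch: publish a large payload as several smaller MQTT messages,
// yielding between publishes so the esp8266 software WDT gets fed.
const size_t kChunk = 1024; // assumed chunk size, below the configured packet limit
void publishChunked(PubSubClient& client, const uint8_t* payload, size_t total) {
    for (size_t off = 0; off < total; off += kChunk) {
        size_t n = (total - off < kChunk) ? (total - off) : kChunk;
        client.publish("sensor/data", payload + off, n); // placeholder topic
        client.loop(); // keep the MQTT connection serviced
        yield();       // feed the software watchdog
    }
}

The subscriber would then need to reassemble the chunks, since each publish is an independent MQTT message.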
OK, in the end the problem was that sending a large JsonDocument triggered the WDT, and the only way I found to overcome it was, as suggested by adamvz, to create a main file with all 100 objects, then call a function to split that file into 10 smaller ones and send each of them over the internet through an HTTP request or Mosquitto.
Supposing you have already created the main file in the SPIFFS memory, then:
This to split the main file:
// Shared state, at file scope so both WritePacks() and httpPublish() can
// use it (the original post listed these inside httpPublish, which would
// not compile). capacity is taken from the ArduinoJson Assistant estimate
// mentioned in the question.
const size_t capacity = 8192;
const char * outputFileNames[] = {"/out1.txt", "/out2.txt", "/out3.txt", "/out4.txt", "/out5.txt", "/out6.txt", "/out7.txt", "/out8.txt", "/out9.txt", "/out10.txt"};
const byte outputCount = sizeof outputFileNames / sizeof outputFileNames[0];
byte outputIndex = 0;
File sourceFile;
File destinationFile;

void WritePacks() {
    sourceFile = LittleFS.open("/file.txt", "r");
    if (!sourceFile) {
        Serial.println(F("Error: file.txt open failed"));
    } else {
        Serial.println("File open w/ success");
        for (byte idx = 0; idx < outputCount; idx++) {
            String aLine;
            aLine.reserve(capacity);
            if (sourceFile.available() == 0) break;
            destinationFile = LittleFS.open(outputFileNames[idx], "w");
            if (!destinationFile) {
                Serial.print(F("can't open destination "));
                Serial.println(outputFileNames[idx]);
                break;
            } else {
                int lineCount = 0;
                while (sourceFile.available() && (lineCount <= 10)) {
                    aLine = sourceFile.readStringUntil('\n');
                    destinationFile.println(aLine); // double check if the '\n' is in the String or not (--> print or println accordingly)
                    lineCount++;
                }
                outputIndex = idx;
                Serial.println(outputIndex);
                destinationFile.close();
            }
        } // end for
        sourceFile.close();
    }
} // end WritePacks
This to publish:
//------ HTTP Publish ------
void httpPublish(){
    // outputFileNames, outputCount, capacity etc. are the file-scope
    // declarations shown above WritePacks()
    //Serial.println(capacity);
    for (byte idx = 0; idx < outputCount; idx++) {
        DynamicJsonDocument doc(capacity);
        DynamicJsonDocument globalDoc(capacity);
        StaticJsonDocument <1024> localDoc;
        String aLine;
        aLine.reserve(capacity);
        destinationFile = LittleFS.open(outputFileNames[idx], "r");
        if (!destinationFile) {
            Serial.print(F("can't open destination "));
            Serial.println(outputFileNames[idx]);
            break;
        } else {
            Serial.print("Reading: ");
            Serial.println(outputFileNames[idx]);
            //int lineCount = 0;
            while (destinationFile.available()) {
                aLine = destinationFile.readStringUntil('\n');
                DeserializationError error = deserializeJson(localDoc, aLine);
                if (!error) globalDoc.add(localDoc);
                else { Serial.println("Error Writing All files"); }
            } // while
            // battery, id, latitudine, longitudine are the sketch's sensor globals
            JsonObject Info = doc.createNestedObject("Info");
            Info["Battery"] = battery;
            Info["ID"] = id;
            Info["Latitudine"] = latitudine;
            Info["Longitudine"] = longitudine;
            JsonArray Data = doc.createNestedArray("Data");
            Data.add(globalDoc);
            HTTPClient http;
            // Send request
            http.begin("yourURL");
            char buffer[capacity];
            size_t n = serializeJson(doc, buffer);
            http.POST(buffer);
            Serial.println(buffer);
            http.end();
            destinationFile.close();
        }
    } // end for
} // end httpPublish

gRPC: What are the best practices for long-running streaming?

We've implemented a Java gRPC service that runs in the cloud, with a unidirectional (client-to-server) streaming RPC which looks like:
rpc PushUpdates(stream Update) returns (Ack);
A C++ client (a mobile device) calls this RPC as soon as it boots up and continuously sends an update every 30 seconds or so, perpetually, as long as the device is up and running.
ChannelArguments chan_args;
// this will be a secure channel eventually
auto channel_p = CreateCustomChannel(remote_addr, InsecureChannelCredentials(), chan_args);
auto stub_p = DialTcc::NewStub(channel_p);
// ...
Ack ack;
auto strm_ctxt_p = make_unique<ClientContext>();
auto strm_p = stub_p->PushUpdates(strm_ctxt_p.get(), &ack);
// ...
while (true) {
    // wait until we are ready to send a new update
    Update updt;
    // populate updt
    if (!strm_p->Write(updt)) {
        // stream is not kosher, create a new one and restart
        break;
    }
}
Now different kinds of network interruption happen while this is running:
- the gRPC service running in the cloud may go down (for maintenance) or may simply become unreachable;
- the device's own IP address keeps changing, as it is a mobile device.
We've seen that on such events neither the channel nor the Write() API detects the network disconnection reliably. At times the client keeps calling Write() (which doesn't return false) but the server doesn't receive any data (Wireshark shows no activity at the outgoing port of the client device).
What are the best practices to recover in such cases, so that the server starts receiving the updates within X seconds of such an event occurring? It is understandable that there will be a loss of X seconds' worth of data whenever such an event happens, but we want to recover reliably within X seconds.
gRPC version: 1.30.2, Client: C++14/Linux, Server: Java/Linux
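For reference, gRPC's HTTP/2 keepalive pings are the standard mechanism for detecting dead connections quickly. A minimal sketch of the relevant channel arguments (the argument names are real gRPC channel args; the values here are illustrative, not from the post below):

// Sketch: enable HTTP/2 keepalive pings so a dead TCP connection is
// detected proactively instead of waiting for a Write() to fail.
grpc::ChannelArguments args;
args.SetInt(GRPC_ARG_KEEPALIVE_TIME_MS, 10000);           // ping every 10 s
args.SetInt(GRPC_ARG_KEEPALIVE_TIMEOUT_MS, 5000);         // declare dead if no ack in 5 s
args.SetInt(GRPC_ARG_KEEPALIVE_PERMIT_WITHOUT_CALLS, 1);  // ping even without active RPCs
args.SetInt(GRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATA, 0);    // no cap on pings without data
auto channel = grpc::CreateCustomChannel(
    remote_addr, grpc::InsecureChannelCredentials(), args);

With these set, a dead connection surfaces as a failed Write() or a non-OK Finish(), which reconnect logic such as the answer below can then handle.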
Here's how we've hacked around this. I want to check whether this can be made any better, or whether anyone from gRPC can suggest a better solution.
The protobuf for our service looks like this. It has an RPC for pinging the service, which is used frequently to test connectivity.
// Message used in IsAlive RPC
message Empty {}

// Acknowledgement sent by the service for updates received
message UpdateAck {}

// Messages streamed to the service by the client
message Update {
    ...
    ...
}

service GrpcService {
    // for checking if we're able to connect
    rpc Ping(Empty) returns (Empty);
    // streaming RPC for pushing updates by client
    rpc PushUpdate(stream Update) returns (UpdateAck);
}
Here is what the C++ client looks like; it does the following:
Connect():
- Create the stub for calling the RPCs, if the stub is nullptr.
- Call Ping() at regular intervals until it succeeds.
- On success, call the PushUpdate(...) RPC to create a new stream.
- On failure, reset the stream to nullptr.
Stream(): do the following in a while(true) loop:
- Get the update to be pushed.
- Call Write(...) on the stream with the update to be pushed.
- If Write(...) fails for any reason, break; control goes back to Connect().
Once every RESTART_INTERVAL_M minutes (15 in the code below), reset everything (stub, channel, stream) to nullptr and start afresh. This is required because at times Write(...) does not fail even when there is no connection between the client and the service: the Write(...) calls succeed, but the outgoing port on the client shows no activity in Wireshark!
Here is the code:
constexpr int GRPC_TIMEOUT_S = 10;
constexpr int RESTART_INTERVAL_M = 15;
constexpr int GRPC_KEEPALIVE_TIME_MS = 10000;
string root_ca, tls_key, tls_cert; // for SSL
string remote_addr = "https://remote.com:5445";
...
...
void ResetStreaming() {
    if (stub_p) {
        if (strm_p) { // graceful restart/stop, this pair of APIs is called together, in this order
            if (!strm_p->WritesDone()) {
                // Log a message
            }
            strm_p->Finish(); // Log if the return value of this is NOT grpc::OK
        }
        strm_p = nullptr;
        strm_ctxt_p = nullptr;
        stub_p = nullptr;
        channel_p = nullptr;
    }
}

void CreateStub() {
    if (!stub_p) {
        ChannelArguments chan_args;
        chan_args.SetInt(GRPC_ARG_KEEPALIVE_TIME_MS, GRPC_KEEPALIVE_TIME_MS);
        channel_p = CreateCustomChannel(
            remote_addr,
            SslCredentials(SslCredentialsOptions{root_ca, tls_key, tls_cert}),
            chan_args);
        stub_p = GrpcService::NewStub(channel_p);
    }
}

void Stream() {
    const auto restart_time = steady_clock::now() + minutes(RESTART_INTERVAL_M);
    while (!stop) {
        // restart every RESTART_INTERVAL_M (15m) even if ALL IS WELL!!
        if (steady_clock::now() > restart_time) {
            break;
        }
        Update updt = GetUpdate(); // get the update to be sent
        if (!stop) {
            if (channel_p->GetState(true) == GRPC_CHANNEL_SHUTDOWN ||
                !strm_p->Write(updt)) {
                // could not write!!
                return; // we will Connect() again
            }
        }
    }
    // stopped due to stop = true or the interval to create a new stream has expired
    ResetStreaming(); // channel, stub, stream are recreated once every 15m
}

bool PingRemote() {
    ClientContext ctxt;
    ctxt.set_deadline(system_clock::now() + seconds(GRPC_TIMEOUT_S));
    Empty req, resp;
    CreateStub();
    if (stub_p->Ping(&ctxt, req, &resp).ok()) {
        static UpdateAck ack;
        strm_ctxt_p = make_unique<ClientContext>(); // need a new context
        strm_p = stub_p->PushUpdate(strm_ctxt_p.get(), &ack);
        return true;
    }
    if (strm_p) {
        strm_p = nullptr;
        strm_ctxt_p = nullptr;
    }
    return false;
}

void Connect() {
    while (!stop) {
        if (PingRemote() || stop) {
            break;
        }
        sleep_for(seconds(5)); // wait before retrying
    }
}

// set to true from another thread when we want to stop
atomic<bool> stop = false;

void StreamUntilStopped() {
    if (stop) {
        return;
    }
    strm_thread_p = make_unique<thread>([&] {
        while (!stop) {
            Connect();
            Stream();
        }
    });
}

// called by the thread that sets stop = true
void Finish() {
    strm_thread_p->join();
}
With this we are seeing that the streaming recovers within 15 minutes (or RESTART_INTERVAL_M) whenever there is a disruption for any reason. This code runs in a fast path, so I am curious to know whether it can be made any better.

Can't find bug in my threading/atomic values code

I am using Code::Blocks with the MinGW compiler and the wxWidgets library.
I am writing a program that reads some data from a microcontroller by sending messages (using an index and subindex) and getting response messages with said data.
My plan was to send messages one by one, waiting for each response message and using atomic int variables to check when the response has arrived.
This is my function for sending a message:
typedef std::chrono::high_resolution_clock Clock;

void sendSDO(int index, int subindex)
{
    int nSent = 0;
    atomic_index.store(index);
    atomic_subindex.store(subindex);
    canOpenClient->SDORead(index, subindex);
    auto start = Clock::now();
    nSentMessages++;
    nSent++;
    Sleep(10);
    while ((atomic_index.load() != 0) && (atomic_subindex.load() != 0))
    {
        auto t = chrono::duration_cast<chrono::milliseconds>(Clock::now() - start);
        if (t.count() > 20)
        {
            if (nSent > 5)
            {
                MainFrame->printTxt("[LOG] response not received\n");
                return;
            }
            atomic_index.store(index);
            atomic_subindex.store(subindex);
            canOpenClient->SDORead(index, subindex);
            nSentMessages++;
            nSent++;
            start = Clock::now();
        }
    }
}
In pseudocode: set the atomic ints to the index and subindex of the value I want to read from the microcontroller, then send the message with SDORead(), and if no response is received within 20 ms, send the message again, up to 5 times.
For receiving messages, I have a separate thread with a callback function which is called when I get a response message from the controller:
void notifyEvent(unsigned char ev_type)
{
    SDO_msg_t msg;
    msg = canOpenClient->Cmd_CustomMessageGet(); // get response message
    if (ev_type == CO_EVENT_SDO_READ)
    {
        if ((msg.index == atomic_index.load()) && (msg.subindex == atomic_subindex.load()))
        {
            // does stuff, like saves message data to set container
            atomic_index.store(0);
            atomic_subindex.store(0);
        }
    }
    if (message data not in container)
        printf("not in container!");
}
Here I set the same atomic int values back to 0 when the correct response message is received, and save the response message data.
I also have the variables nSentMessages and nReceivedMessages, which hold the number of messages sent and received. I check at the end whether these values are the same. Normally I wouldn't need this (since I wait for every response); I put it there as an extra safety measure.
Now onto the problem:
1) My problem is in the callback function notifyEvent(), where I presumably save the response message data to a container, but I still sometimes get "not in container!" from that if statement and I don't know why. (My container is a normal set<EDSobject, cmp>; it's not atomic or anything, since I know there won't be simultaneous reads/writes to it from different threads.)
2) If you check my function sendSDO(), there is a line Sleep(10). The program works OK with it, but if I remove it, the program ends up with different values for nSentMessages and nReceivedMessages - 576 and 575. This happens every time I run the program and I don't understand why.
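A sketch of an alternative structure, not from the original post: the spin-wait on two atomics can be replaced by a mutex and condition_variable handshake, which avoids the busy loop and makes the request/response pairing explicit. Names such as wait_for_response() and on_response() are hypothetical; canOpenClient->SDORead() is taken from the question:

#include <chrono>
#include <condition_variable>
#include <mutex>

std::mutex m;
std::condition_variable cv;
int pending_index = 0, pending_subindex = 0; // guarded by m

// Sender side: publish the request, then block (with timeout) until the
// receiver thread clears it. Returns true if the response arrived in time.
bool wait_for_response(int index, int subindex)
{
    std::unique_lock<std::mutex> lk(m);
    pending_index = index;
    pending_subindex = subindex;
    canOpenClient->SDORead(index, subindex); // same request call as in the question
    return cv.wait_for(lk, std::chrono::milliseconds(20),
                       [] { return pending_index == 0 && pending_subindex == 0; });
}

// Receiver callback: match the response against the pending request,
// clear it, and wake the sender.
void on_response(int index, int subindex)
{
    {
        std::lock_guard<std::mutex> lk(m);
        if (index != pending_index || subindex != pending_subindex)
            return; // stale or unexpected response, ignore
        pending_index = 0;
        pending_subindex = 0;
    }
    cv.notify_one();
}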

Linux poll on serial transmission end

I'm implementing RS485 on an ARM development board, using a serial port and a GPIO for data enable.
I set data enable high before sending, and I want it set low after the transmission is complete.
It can be done simply by writing:
//fd = open("/dev/ttyO2", ...);
DataEnable.Set(true);
write(fd, data, datalen);
tcdrain(fd); //Wait until all data is sent
DataEnable.Set(false);
I wanted to change from blocking mode to non-blocking and use poll with the fd, but I don't see any poll event corresponding to 'transmission complete'.
How can I get notified when all the data has been sent?
System: Linux
Language: C++
Board: BeagleBone Black
I don't think it's possible. You'll either have to run tcdrain in another thread and have it notify the main thread, or use a timeout on poll and poll to see whether the output has been drained.
You can use the TIOCOUTQ ioctl to get the number of bytes left in the output buffer and tune the poll timeout according to the baud rate. That should reduce the amount of polling you need to do to just once or twice. Something like:
enum { writing, draining, idle } write_state;

while (1) {
    int write_event, timeout = -1;
    ...
    if (write_state == writing) {
        poll_fds[poll_len].fd = write_fd;
        poll_fds[poll_len].events = POLLOUT;
        write_event = poll_len++;
    } else if (write_state == draining) {
        int outq;
        ioctl(write_fd, TIOCOUTQ, &outq);
        if (outq == 0) {
            DataEnable.Set(false);
            write_state = idle;
        } else {
            // 10 bits per byte, 1000 milliseconds in a second
            timeout = outq * 10 * 1000 / baud_rate;
            if (timeout < 1) {
                timeout = 1;
            }
        }
    }
    int r = poll(poll_fds, poll_len, timeout);
    ...
    if (write_state == writing && r > 0 && (poll_fds[write_event].revents & POLLOUT)) {
        DataEnable.Set(true); // Gets set even if already set.
        int n = write(write_fd, write_data, write_datalen);
        write_data += n;
        write_datalen -= n;
        if (write_datalen == 0) {
            write_state = draining;
        }
    }
}
Stale thread, but I have been working on RS-485 with a 16550-compatible UART under Linux and found:
- tcdrain works, but it adds a delay of 10 to 20 msec; it seems to be polled.
- The value returned by TIOCOUTQ seems to count bytes in the OS buffer but NOT bytes in the UART FIFO, so it may underestimate the delay required if transmission has already started.
I am currently using CLOCK_MONOTONIC to timestamp each send, calculating when the send should be complete, then checking that time against the next send and delaying if necessary. Sucks, but it seems to work.
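A sketch of that timestamping approach, under assumed names (note_send() and wait_tx_complete() are hypothetical helpers; the 10-bits-per-byte estimate matches the answer above):

// Sketch: estimate when the last bit leaves the wire and sleep until then
// before dropping data-enable or starting the next send.
#include <time.h>

static struct timespec tx_done; // when the current transmission should finish

// call right after write()
void note_send(size_t nbytes, long baud_rate)
{
    clock_gettime(CLOCK_MONOTONIC, &tx_done);
    // ~10 bit times per byte (start + 8 data + stop)
    long long ns = (long long)nbytes * 10 * 1000000000LL / baud_rate;
    tx_done.tv_sec  += ns / 1000000000LL;
    tx_done.tv_nsec += ns % 1000000000LL;
    if (tx_done.tv_nsec >= 1000000000L) {
        tx_done.tv_sec++;
        tx_done.tv_nsec -= 1000000000L;
    }
}

// call before clearing data-enable (or before the next send)
void wait_tx_complete(void)
{
    clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &tx_done, NULL);
}

As noted above, this only bounds the OS-buffer portion of the delay; bytes already in the UART FIFO may push the true completion time slightly later.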