Google Cloud IoT: some config update messages are missing when sending config updates frequently from Cloud Functions to a device - google-cloud-platform

I am using config updates and Cloud Functions for communication between a mobile application and an ESP32 device by following the example here, but when I send config update messages frequently, some of them are not delivered; say, out of 5, only 3 config update messages get through. I have two questions:
1) How frequently can we send config updates without losing any of them?
2) Is there an alternative way to communicate between Cloud Functions and an IoT device?

According to the docs: [IoT docs]
Configuration updates are limited to 1 update per second, per device.
However, for best results, device configuration should be updated much
less often — at most, once every 10 seconds.
The update rate is calculated as the time between the most recent
server acknowledgment and the next update request.
If your operations are mostly configuration updates, I cannot think of another alternative that would perform better.
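One way to stay under the quoted limit is to coalesce updates on the sending side, so that only the newest configuration is sent once the interval has elapsed. The following is a minimal sketch, not the documented approach: `ConfigUpdateThrottle` and the `send` callable are hypothetical names, `send` stands in for whatever call actually pushes the config to the device, and the state is kept in memory, which assumes a single long-lived process (a Cloud Function would need to keep this state in an external store).

```python
import time
from typing import Callable, Dict


class ConfigUpdateThrottle:
    """Coalesce config updates so each device gets at most one per interval."""

    def __init__(self, send: Callable[[str, bytes], None],
                 min_interval_s: float = 10.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._send = send                      # hypothetical wrapper around the real update call
        self._min_interval_s = min_interval_s  # 10 s follows the docs' "best results" advice
        self._clock = clock
        self._last_sent: Dict[str, float] = {}
        self._pending: Dict[str, bytes] = {}

    def submit(self, device_id: str, config: bytes) -> bool:
        """Queue a config; send it now if allowed. Returns True if it was sent."""
        # Only the newest config is kept: older pending ones are overwritten.
        self._pending[device_id] = config
        return self.flush(device_id)

    def flush(self, device_id: str) -> bool:
        """Send the pending config if the interval has elapsed. Returns True if sent."""
        if device_id not in self._pending:
            return False
        last = self._last_sent.get(device_id)
        if last is not None and self._clock() - last < self._min_interval_s:
            return False
        self._send(device_id, self._pending.pop(device_id))
        # Measure the interval from when the (synchronous) send returned.
        self._last_sent[device_id] = self._clock()
        return True
```

Because a configuration describes desired state, dropping intermediate values is usually acceptable; if every message must arrive, this sketch is the wrong tool. `flush` would need to be called periodically to deliver a config that was held back.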

Related

How to do "live request batching" in gcloud

Here is my situation:
I have a rather slow tensorflow model that runs on GPU (2 to 3 seconds per prediction)
A prediction for a single 'entity' vs a prediction for 8 'entities' takes about the same time
This means I could be 8 times as efficient by simply combining multiple predictions in the same request
I have a service on AI platform serving requests to that model
The service works for slow request rates but has trouble scaling up (anything over 4 QPS is too much to handle)
My question then is:
Is there a standard way / best practice for batching live client requests:
When receiving a request, wait a little bit for other requests
After a while, or when the number of requests reaches a set number, forward the requests in a single "batch" to another service.
If traffic is low, the delay will expire before the batch is full, but since traffic is low, that's not an issue
If traffic is high, the batch will be full before the delay, and the client will have to wait less
I have an almost-working solution with App Engine + Firebase (for hosting the shared 'queue'), but implementing the delay is giving me trouble (App Engine doesn't seem to like Python's threading.Timer).
I'd appreciate something that could work with app engine, but at this point I'm open to any suggestions (as long as it is applicable on google cloud).
Thanks!
The ideal (but not the cheapest) solution is to use Dataflow.
When a prediction request comes in, publish it to PubSub.
Deploy a Dataflow pipeline in streaming mode, with fixed windows of X minutes and an additional, non-accumulating trigger that fires after Y events in the window.
When a window trigger fires (either on the number of messages or on the timer), do the batch processing.
You can also imagine other designs that are simpler and cheaper.
Still publish the prediction requests to PubSub, then:
1) Schedule a Cloud Function or a Cloud Run service every X minutes to pull the PubSub subscription and then trigger the batch job. However, the interval is fixed.
2) When you publish the message to PubSub, also store, in Firestore for example, a counter that you increment and the date of the first message published to PubSub.
If the number of messages is above your threshold, send a request to your other process, which pulls the PubSub subscription and runs the batch processing (as in #1). Reset the counter value and the message date value.
Set up a Cloud Scheduler job that checks, every minute, the date of the first message in Firestore. If it is older than your time limit, send a request to your other process, which pulls the PubSub subscription and runs the batch processing (as in #1). Reset the counter value and the message date value.
Option #2 will generate a lot of Firestore reads/writes, but will be cheaper than Dataflow.
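The counter-plus-first-message-date logic of the second design can be sketched as follows. This is only an illustration under stated assumptions: `BatchTrigger`, `on_publish`, `on_schedule` and `run_batch` are hypothetical names, the state lives in memory here where the answer suggests Firestore, and `run_batch` stands in for the request to the process that pulls the subscription. The sketch fires when the counter reaches the threshold.

```python
import time
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class BatchTrigger:
    """Tracks a message counter and the timestamp of the first pending message."""
    max_count: int
    max_age_s: float
    run_batch: Callable[[], None]          # hypothetical: asks the worker to pull and process
    clock: Callable[[], float] = time.time
    count: int = 0                         # would live in Firestore in the described design
    first_message_at: Optional[float] = None

    def on_publish(self) -> None:
        """Call right after publishing a request to PubSub."""
        if self.first_message_at is None:
            self.first_message_at = self.clock()
        self.count += 1
        if self.count >= self.max_count:
            self._fire()

    def on_schedule(self) -> None:
        """Call from the periodic scheduler job (every minute in the answer)."""
        if (self.first_message_at is not None
                and self.clock() - self.first_message_at >= self.max_age_s):
            self._fire()

    def _fire(self) -> None:
        # Reset first so the next batch starts accumulating from zero.
        self.count = 0
        self.first_message_at = None
        self.run_batch()
```

With real shared storage, the read-check-reset sequence would need to run in a transaction so that two concurrent publishers do not both trigger the batch.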

How to use Google Cloud PubSub and Run to handle resource-intensive long-running tasks?

I've got a Google Cloud PubSub topic which at times has thousands of messages and at times zero messages coming in. These messages represent tasks which can take upwards of an hour each. Preferably I'd use Cloud Run for this, as it scales really well with demand: if a thousand messages get published, I want hundreds of Cloud Run instances to spin up. These Run instances get started by a push subscription. The problem is that PubSub has a 600 second timeout for the acknowledgement. This means that, in order to have Cloud Run process these messages, they have to finish within 600 seconds. If they do not, PubSub times the message out and sends it again, causing the task to be restarted until the first task finally does acknowledge it (so the same task is run many times). Cloud Run acknowledges a message by returning a 2xx HTTP status code. The documentation states:
When an application running on Cloud Run finishes handling a request, the container instance's access to CPU will be disabled or severely limited. Therefore, you should not start background threads or routines that run outside the scope of the request handlers.
So is it maybe possible to acknowledge a PubSub request through code and continue the processing, without having Google Cloud Run hand over the resources? Or is there a better solution I'm unaware of?
Because these processes are so code/resource-intensive, I feel Cloud Functions will not suffice. I've looked at https://cloud.google.com/solutions/using-cloud-pub-sub-long-running-tasks and https://cloud.google.com/blog/products/gcp/how-google-cloud-pubsub-supports-long-running-workloads. But these didn't answer my question.
I've looked at Google Cloud Tasks, which might be something? But the rest of the project has been built around PubSub/Run/Functions, so preferably I stick with that.
This project is written in Python.
So preferably I would like to write my Google Cloud Run tasks like this:
import logging
from flask import Flask, request

app = Flask(__name__)
logger = logging.getLogger(__name__)

@app.route('/', methods=['POST'])
def index():
    """Endpoint for Google Cloud PubSub messages"""
    pubsub_message = request.get_json()
    logger.info(f'Received PubSub pubsub_message {pubsub_message}')
    # message_incorrect is my own validation helper (not shown)
    if message_incorrect(pubsub_message):
        return "Invalid request", 400  # use normal NACK handling
    # acknowledge message here without returning
    # ...
    # Do actual processing of the task here
    # ...
So how can or should I solve this, so that the resource-intensive tasks get properly scaled on demand (hence a push PubSub subscription) and each task only gets executed once?
Answers:
In short, what has been answered: Cloud Functions and Cloud Run are just not suited for this problem. There is no way to have them do tasks that take longer than 9 or 15 minutes respectively. The only solution is to switch to another Google service and use a pull-style subscription, losing out on the auto-scaling of Cloud Run/Functions.
Cloud Run on GKE can handle long-running processes, with more CPU and memory than is available on the managed platform. However, you have a GKE cluster always running and you lose the "pay-as-you-use" benefit.
If you want to use this solution, don't link a PubSub push subscription directly to your Cloud Run on GKE. Use Cloud Tasks with an HTTP task for this. The timeout is longer than with PubSub (up to 24h instead of 10 min) and the retry policies are customizable.
Neither Cloud Functions nor Cloud Run is sufficient for arbitrarily long running operations. Cloud Functions has a hard cap of 9 minutes per invocation, and Cloud Run caps at 60. If you need more time, you're going to have to delegate the work to another product, such as Google Compute Engine. It should be possible to kick off some Compute Engine work from one of the serverless products.
Given the limits of PubSub acks, you'll probably have to find a way for a client to poll or listen to some resource to find out when the work is actually done. You could use a database for that, and Cloud Firestore lets you listen to documents to find out when they change. So you could use that to track the status of your long-running work.

WSO2 API Manager 2.1 : Gateway not enforcing Throttling Limits

We have deployed API-M 2.1 in a distributed way (each component, GW, TM and KM, runs in its own Docker image) on top of DC/OS 1.9 (Mesos).
We are having trouble getting the gateway to enforce throttling policies (whether subscription tiers or app-level policies). Here is what we have managed to establish so far:
The Traffic Manager itself does its job: it receives the event streams, analyzes them on the fly and pushes an event onto the JMS topic throttledata.
The Gateway reads the message properly.
So basically we have ruled out a communication issue.
However we found two potential issues:
In the event which is pushed to the TM component, the value of the appTenant is null (instead of carbon.super)- We have a single tenant defined.
When the gateway receives the throttling message, it decides to let the message go thinking the "stopOnQuotaReach" is set to false, when it is set to true (we checked the value in the database).
Digging into the source code, we traced those two issues to a single source: both values above are read from the authContext and are apparently set incorrectly. We are stuck and running out of ideas of things to try, and would need some pointers to what could be a potential source of the problem and things to check.
Can somebody help, please?
Thanks- Isabelle.
Are there two TMs with HA enabled in the system?
If the TM is HA enabled, how do the gateways publish data to the TM? Is it load-balanced data publishing or failover data publishing to the TMs?
Did you follow the articles below to configure the environment for your deployment?
http://wso2.com/library/articles/2016/10/article-scalable-traffic-manager-deployment-patterns-for-wso2-api-manager-part-1/
http://wso2.com/library/articles/2016/10/article-scalable-traffic-manager-deployment-patterns-for-wso2-api-manager-part-2/
Is throttling completely not working in your environment?
Have you noticed any JMS connection related logs in the gateway nodes?
In these tests, we have disabled HA to avoid possible complications. Neither subscription nor app throttling policies are working, in both cases because parameters that should have values do not have the expected value (appTenant, stopOnQuotaReach).
Our scenario is far more basic. If we go with one instance of each component, it fails as Isabelle described. And the only thing we know is that both parameters come from the Authentication Context.
Thank you!

AWS API Gateway Cache - Multiple service hits with burst of calls

I am working on a mobile app that will broadcast a push message to hundreds of thousands of devices at a time. When each user opens their app from the push message, the app will hit our API for data. The API resource will be identical for each user of this push.
Now let's assume that all 500,000 users open their app at the same time. API Gateway will get 500,000 identical calls.
Because all 500,000 nearly concurrent requests are asking for the same data, I want to cache it. But keep in mind that it takes about 2 seconds to compute the requested value.
What I want to happen
I want API Gateway to see that the data is not in the cache, let the first call through to my backend service while the other requests are held in queue, populate the cache from the first call, and then respond to the other 499,999 requests using the cached data.
What is (seems to be) happening
API Gateway, seeing that there is no cached value, is sending every one of the 500,000 requests to the backend service! So I will be recomputing the value with some complex db query way more times than resources will allow. This happens because the last call comes into API Gateway before the first call has populated the cache.
Is there any way I can get this behavior?
I know that, based on my example, I could perhaps prime the cache by invoking the API call myself just before broadcasting the bulk push job, but the actual use-case is slightly more complicated than my simplified example. But rest assured, solving this simplified use-case will solve what I am trying to do.
If you anticipate that kind of burst concurrency, priming the cache yourself is certainly the best option. Have you also considered adding throttling to the stage/method to protect your backend from a large surge in traffic? Clients could be instructed to retry on throttles and they would eventually get a response.
I'll bring your feedback and proposed solution to the team and put it on our backlog.
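The behaviour the question asks for (let one request through, hold the others, then answer them all from the stored result) can be approximated inside the backend service itself while API Gateway does not do it. The following is a minimal sketch under stated assumptions, not an API Gateway feature: `CoalescingCache` is a hypothetical name, it works within a single process only, and it has no expiry (TTL) logic.

```python
import threading
from typing import Callable, Dict


class CoalescingCache:
    """Cache where concurrent misses for the same key trigger one computation."""

    def __init__(self, compute: Callable[[str], object]) -> None:
        self._compute = compute                  # the expensive call, e.g. the slow db query
        self._lock = threading.Lock()
        self._values: Dict[str, object] = {}
        self._in_flight: Dict[str, threading.Event] = {}

    def get(self, key: str) -> object:
        with self._lock:
            if key in self._values:
                return self._values[key]
            event = self._in_flight.get(key)
            leader = event is None
            if leader:
                # First caller for this key: it will do the computation.
                event = threading.Event()
                self._in_flight[key] = event
        if leader:
            try:
                value = self._compute(key)
                with self._lock:
                    self._values[key] = value
            finally:
                with self._lock:
                    del self._in_flight[key]
                event.set()                      # wake the waiting callers
            return value
        # Follower: wait for the leader, then read the stored value.
        event.wait()
        with self._lock:
            if key in self._values:
                return self._values[key]
        # The leader failed; try again (one follower becomes the new leader).
        return self.get(key)
```

With several backend instances, each instance would still compute the value once, so this reduces the load to one computation per instance and not one overall.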

What are the possible use cases for Amazon SQS or any Queue Service?

So I have been trying to get my hands on Amazon's AWS, since my company's whole infrastructure is based on it.
One component I have never been able to understand properly is the Queue Service. I have searched Google quite a bit but I haven't been able to get a satisfactory answer. I think a cron job and a Queue Service are somewhat similar; correct me if I am wrong.
So what exactly does SQS do? As far as I understand, it stores simple messages to be used by other components in AWS to do tasks, and you can send messages to do that.
In this question, Can someone explain to me what Amazon Web Services components are used in a normal web service?, the answer mentioned they used SQS to queue tasks they want performed asynchronously. Why not just give a message back to the user and do the processing later on? Why wait for SQS to do its stuff?
Also, let's say I have a web app which allows users to schedule some daily tasks: how would SQS fit into that?
No, cron and SQS are not similar. One (cron) schedules jobs while the other (SQS) stores messages. Queues are used to decouple message producers from message consumers. This is one way to architect for scale and reliability.
Let's say you've built a mobile voting app for a popular TV show and 5 to 25 million viewers are all voting at the same time (at the end of each performance). How are you going to handle that many votes in such a short space of time (say, 15 seconds)? You could build a significant web server tier and database back-end that could handle millions of messages per second but that would be expensive, you'd have to pre-provision for maximum expected workload, and it would not be resilient (for example to database failure or throttling). If few people voted then you're overpaying for infrastructure; if voting went crazy then votes could be lost.
A better solution would use some queuing mechanism that decoupled the voting apps from your service where the vote queue was highly scalable so it could happily absorb 10 messages/sec or 10 million messages/sec. Then you would have an application tier pulling messages from that queue as fast as possible to tally the votes.
One thing I would add to @jarmod's excellent and succinct answer is that the size of the messages does matter. For example in AWS, the maximum size is just 256 KB unless you use the Extended Client Library, which increases the max to 2 GB. But note that it uses S3 as a temporary storage.
In RabbitMQ the practical limit is around 100 KB. There is no hard-coded limit in RabbitMQ, but the system simply stalls more or less often. From personal experience, RabbitMQ can handle a steady stream of around 1 MB messages for about 1 - 2 hours non-stop, but then it will start to behave erratically, often becoming a zombie and you'll need to restart the process.
SQS is a great way to decouple services, especially when there is a lot of heavy-duty, batch-oriented processing required.
For example, let's say you have a service where people upload photos from their mobile devices. Once the photos are uploaded your service needs to do a bunch of processing of the photos, e.g. scaling them to different sizes, applying different filters, extracting metadata, etc.
One way to accomplish this would be to post a message to an SQS queue (or perhaps multiple messages to multiple queues, depending on how you architect it). The message(s) describe work that needs to be performed on the newly uploaded image file. Once the message has been written to SQS, your application can return a success to the user because you know that you have the image file and you have scheduled the processing.
In the background, you can have servers reading messages from SQS and performing the work specified in the messages. If one of those servers dies another one will pick up the message and perform the work. SQS guarantees that a message will be delivered eventually so you can be confident that the work will eventually get done.
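The photo example can be sketched with an in-memory queue standing in for SQS. This is only an illustration of the decoupling: `upload_handler` and `worker` are hypothetical names, Python's `queue.Queue` replaces the real service, and it does not model SQS behaviour such as redelivery after a worker dies.

```python
import queue
import threading
from typing import List, Optional


def upload_handler(tasks: queue.Queue, photo_id: str) -> str:
    """Producer: enqueue the work and return to the user immediately."""
    for step in ("resize", "filter", "extract_metadata"):
        tasks.put({"photo_id": photo_id, "step": step})
    return "accepted"


def worker(tasks: queue.Queue, done: List[str]) -> None:
    """Consumer: process messages until a None sentinel arrives."""
    while True:
        msg: Optional[dict] = tasks.get()
        if msg is None:
            tasks.task_done()
            break
        # Stand-in for the real processing of the photo.
        done.append(f"{msg['photo_id']}:{msg['step']}")
        tasks.task_done()


if __name__ == "__main__":
    tasks: queue.Queue = queue.Queue()
    done: List[str] = []
    workers = [threading.Thread(target=worker, args=(tasks, done)) for _ in range(2)]
    for w in workers:
        w.start()
    print(upload_handler(tasks, "img1"))  # returns before the work is done
    tasks.join()                          # wait until every message is processed
    for _ in workers:
        tasks.put(None)                   # one sentinel per worker
    for w in workers:
        w.join()
    print(sorted(done))
```

The handler does not know how many workers exist or how fast they are, which is the point of the queue: producers and consumers can be scaled independently.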