I have a Google Cloud Function subscribed to a topic. Our Go API publishes a message to the topic when an email needs to be sent to a user. The GCF creates the email object and sends it to SendGrid. The problem is that 90% of the time, the Cloud Function gets invoked twice.
The acknowledgement deadline on the subscription is 600 seconds and it's clearly stated in the Docs that GCF acknowledges internally.
I understand that Pub/Sub guarantees at-least-once delivery and that GCF guarantees at-least-once execution for background functions. Still, this happens in most cases, and I'm pretty sure that's not expected.
I'm 100% sure it's not our API that's sending 2 messages. The cloud function runs twice even when I manually publish a message from the GCP console to test.
So the execution_id is the same. Both executions take less than 1 second.
So I'm not sure what's going on. Who is responsible for this duplication?
I'm guessing it's GCF seeing as both executions have the same ID?
Does anyone have any ideas about how to fix this?
I ran into almost the same situation. I fixed it by deleting the Cloud Functions entries and the Cloud Pub/Sub subscriptions, then recreating them. It seems to work fine so far.
I have a Google Cloud Function subscribed to a topic. Cloud Scheduler publishes a message to the Pub/Sub topic every 5 minutes. The problem is that the Cloud Function is sometimes invoked a second time, 90 seconds after the first invocation.
The acknowledgement deadline on the subscription is 600 seconds.
So I can't figure out why GCF is invoked twice within 90 seconds by Pub/Sub.
Is the 90-second delay before the second invocation related to something?
Your duplicates could be on either the publish side or the subscribe side. If the duplicate messages have different message IDs, then the duplicates are generated on the publish side. This could be caused by retries on the publish side in response to retryable errors. If the messages have the same message ID, then the duplication is on the subscribe side within Pub/Sub.
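The message ID check above can be sketched in a few lines of Python. This is only an illustration: the `(message_id, payload)` tuples are an assumed shape for data you might collect from your function logs, and `classify_duplicates` is a hypothetical helper, not part of any Google library. It also assumes that two deliveries with an identical payload are duplicates, which is not true if your publisher legitimately sends the same content twice.

```python
from collections import defaultdict


def classify_duplicates(deliveries):
    """Report, per payload, where duplicate deliveries most likely arose.

    `deliveries` is a list of (message_id, payload) tuples, e.g. gathered
    from function logs. This input shape is an assumption for illustration.
    """
    ids_by_payload = defaultdict(list)
    for message_id, payload in deliveries:
        ids_by_payload[payload].append(message_id)

    report = {}
    for payload, ids in ids_by_payload.items():
        if len(ids) < 2:
            continue  # delivered once, nothing to classify
        kinds = set()
        if len(set(ids)) > 1:
            kinds.add("publish-side")    # same payload under different message IDs
        if len(ids) > len(set(ids)):
            kinds.add("subscribe-side")  # same message ID delivered more than once
        report[payload] = sorted(kinds)
    return report
```

For example, `classify_duplicates([("1", "a"), ("1", "a")])` reports `"a"` as a subscribe-side duplicate, because the same message ID arrived twice.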
Cloud Pub/Sub offers at-least-once delivery semantics. That means duplicates can occur even if you acknowledge the message and even if the acknowledgement deadline has not passed. If you want stronger delivery guarantees, you can use Pub/Sub's exactly-once delivery feature, which is currently in public preview. However, this requires you to set up your Cloud Function with an HTTP trigger and to create a push subscription in Pub/Sub that points to the address of the function, because there is no way to enable the exactly-once setting on a subscription created by Cloud Functions.
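If you do move to an HTTP trigger behind a push subscription, the function receives the message wrapped in a JSON envelope, with the payload base64-encoded under `message.data` and the ID under `message.messageId`. A minimal sketch of decoding that envelope is below. `handle_push` is a hypothetical name, the function is framework-agnostic (it takes the raw request body), and the choice of 204 for success and 400 for a malformed body is an assumption rather than a requirement.

```python
import base64
import json


def handle_push(body: bytes):
    """Decode a Pub/Sub push request body and return (status_code, info)."""
    try:
        envelope = json.loads(body)
        message = envelope["message"]
        data = base64.b64decode(message.get("data", "")).decode("utf-8")
    except (ValueError, KeyError, TypeError, AttributeError):
        # Not a valid push envelope. A non-success status is treated as a
        # failed delivery, so Pub/Sub may send the request again.
        return 400, {"error": "bad push envelope"}

    # A success status acknowledges the message.
    return 204, {"message_id": message.get("messageId"), "data": data}
```

The returned `message_id` is the value you would use as a deduplication key if the handler still sees the occasional repeat.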
I currently have a Pub/Sub push subscription that pushes to an HTTP endpoint. This endpoint then triggers my Cloud Function. I am running into an issue where events that have already been sent to my Cloud Function are being resent by the Pub/Sub subscription. I increased my subscription's ack deadline to 3 minutes, but about a minute into my Cloud Function's execution, Pub/Sub resends the same event that has already been processed. This leads to multiple invocations of my Cloud Function and further issues. I haven't seen any way to disable Pub/Sub retries, but I'm wondering if there are any suggestions as to a root cause, or any workarounds?
Current set-up:
Cloud Function timeout limit: 120 seconds
Pub/Sub subscription ack deadline: 180 seconds
Dead-lettering after 5 retries
You will need to make the function idempotent and flag recently processed events so that retries do not run the work again. This could be a timestamp stored in a database, which you then filter on by time and by any metadata the event carries. It is also important to return a successful result, so that the message is acknowledged.
Doug covers this concept in a video. While it doesn't reference Pub/Sub, it is still just as valid: https://www.youtube.com/watch?v=Pwsy8XR7HNE
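To make the idempotency idea concrete, here is a minimal sketch of a handler that remembers recently seen message IDs for a fixed window and skips repeats while still returning success. Everything here is hypothetical naming (`RecentlySeen`, `handle_event`, `sent`), and the in-memory dictionary is only a stand-in: separate function instances do not share memory, so a real deployment would need a shared store with an atomic check-and-set.

```python
import time


class RecentlySeen:
    """Remember keys for ttl_seconds; an in-memory stand-in for a shared store."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock
        self._seen = {}  # key -> time the key was first seen

    def first_time(self, key):
        now = self.clock()
        # Drop entries older than the TTL so the map does not grow forever.
        self._seen = {k: t for k, t in self._seen.items() if now - t < self.ttl}
        if key in self._seen:
            return False
        self._seen[key] = now
        return True


seen = RecentlySeen(ttl_seconds=600)
sent = []  # stands in for the real side effect, e.g. sending an email


def handle_event(message_id, payload):
    """Run the side effect once per message ID, but always report success."""
    if seen.first_time(message_id):
        sent.append(payload)
    return "ok"  # success for duplicates too, so the message is not retried
```

Note that this sketch marks a message as seen before doing the work. If the work can fail partway through, you would want to record the ID only after it succeeds, or clear it on failure, so that a genuine retry is not swallowed.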
Is there a way for me to run a Cloud Function to update some data whenever there is a lot of activity on one of my pages? Sorry if this is a stupid question.
Cloud Functions does not have any triggers that automatically respond to the load on a web site. You will have to find some other way to gauge that traffic, and then perhaps invoke an HTTP trigger directly.
If you want to see the complete list of trigger types, see the documentation.
As far as I know, there is no direct trigger between a spike of events in service A and a Cloud Function (unless service A is itself a Cloud Function, which would indeed scale).
However, there is an indirect route through Stackdriver: you can define rules that, once hit, deliver an event to a Pub/Sub topic that a Cloud Function of your choice can subscribe to. Be cautious about what you do in the Cloud Function: you may have concurrency issues depending on the database technology you want to modify. In other words, try to be as idempotent as possible.
I think you may be interested in this community tutorial.
I have an AWS Lambda that is triggered by an SNS message. Many times it has reached the maximum duration allowed by AWS, and AWS killed it immediately.
I have to either dig into the Lambda logs or the lambda duration chart to find out about the error.
Is there a better way to report this kind of error?
Yes, there are some 3rd party tools that help you monitor your environment and provide exactly that - filter on specific errors and drill down to what happened there (the input event, the outgoing HTTP requests etc.).
Moreover, you can also configure alerts on specific errors and receive them via Slack or email.
Disclosure: I work for Lumigo, a company that does exactly that.
I deployed a service written in Python 2.7 using AWS Lambda. It extracts data from some pages and sends the results to a web app. The service is triggered by an AWS CloudWatch event (fixed rate of 5 minutes).
However, I found that sometimes the service was triggered twice at the same time. I know this because two log streams printed the same data and results but with different RequestIds. The database also had duplicate data, which showed that both invocations completed successfully. It looked like the service was triggered twice at almost the same time for no reason.
Has anyone experienced the same thing, and how did you fix it? Or is there a way to ensure that only one instance of the function executes at a time?
Yes. Some AWS services only guarantee at-least-once delivery. I have experienced this with CloudWatch and CloudTrail. I do not know if you can limit it to exactly once. You have to check whether the data has already been processed. I overcame this by making boto3 calls in my Python code before processing the data. Without knowing your situation, it is difficult to suggest a solution.
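One way to do the "has this already been processed" check is to claim a unique key for each event before doing any work, and let the storage layer reject the second claim. The sketch below is not the boto3 approach described above: it uses SQLite from the Python standard library purely as a local stand-in, since Lambda instances do not share local disk and a real deployment would need a shared database with an equivalent conditional write. The names `make_store`, `claim`, and `handler` are hypothetical, and using the scheduled event's `time` field as the key assumes both duplicate invocations carry the same value.

```python
import sqlite3


def make_store(path=":memory:"):
    """Open the store and create the table of processed event keys."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS processed (event_key TEXT PRIMARY KEY)")
    return conn


def claim(conn, event_key):
    """Return True only for the first caller to claim event_key."""
    with conn:  # commits the insert when the block exits without error
        cur = conn.execute(
            "INSERT OR IGNORE INTO processed (event_key) VALUES (?)", (event_key,)
        )
    return cur.rowcount == 1  # 0 rows changed means the key already existed


def handler(event, conn, process):
    """Process the event unless another invocation has already claimed it."""
    key = event["time"]  # assumption: duplicates share the scheduled time
    if not claim(conn, key):
        return "skipped duplicate"
    process(event)
    return "processed"
```

Because the primary key makes the insert atomic, two invocations racing on the same key cannot both win the claim, which is the property a plain "read, then write" check does not give you.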