Happy May 1st,
I'm doing a simple POC to utilize the dead letter topic feature of Pub/Sub. I configured my subscription to republish messages to a separate dead letter topic after 20 maximum delivery attempts (below are the subscription pull code and a sample message).
Note: I configured the subscription using Cloud Console.
Problem/challenge: Even after 36 delivery attempts, the test message has still not been republished to the dead letter topic. Based on the documentation I would expect the test message to be republished to the dead letter topic and to stop being delivered after 20 attempts. What am I missing?
Pull Subscription code
const {PubSub} = require('@google-cloud/pubsub');
var moment = require('moment');

process.env['GOOGLE_APPLICATION_CREDENTIALS'] = 'abcxyz.json';

const pubSubClient = new PubSub();
const timeout = 100;

async function listenWithCustomAttributes() {
  const subscription = pubSubClient.subscription("projects/random-1234/subscriptions/testsub");

  // Create an event handler to handle messages
  const messageHandler = (message) => {
    const datetime = moment().format('MMMM Do YYYY, h:mm:ss a');
    console.log(`${datetime}::: ${message.id}:`);
    console.log(`${message.data}`);
    console.log(`Delivery Attempt: ${message.deliveryAttempt}`);
    console.log(`Custom attributes: ${JSON.stringify(message.attributes)}`);
    console.log('\n');

    // NACK for re-delivery
    message.nack();
  };

  subscription.on('message', messageHandler);

  // Keep listening long enough to observe repeated re-deliveries
  setTimeout(() => {
    subscription.removeListener('message', messageHandler);
  }, timeout * 1000000);
}

listenWithCustomAttributes();
Sample PubSub message
const message = {
  "event": "First",
  "message": "HELLOWORLD!!!!",
};
I was finally able to resolve this issue.
According to the documentation, "If you configured the subscription using Cloud Console, the roles are granted automatically." That no longer seems to be the case. You need to grant the required publisher and subscriber roles under the "DEAD LETTERING" tab (next to OVERVIEW) on the subscription's page in the console, or add the IAM policy bindings as described in the docs.
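For anyone who prefers to do this programmatically instead of through the console, below is a minimal sketch (untested; the project number, dead letter topic name, and subscription name are placeholders) that grants the Pub/Sub service agent the two roles dead lettering needs: roles/pubsub.publisher on the dead letter topic and roles/pubsub.subscriber on the source subscription, using the same @google-cloud/pubsub client as above.
const {PubSub} = require('@google-cloud/pubsub');

const pubSubClient = new PubSub();

// The Pub/Sub service agent of your project; PROJECT_NUMBER is a placeholder.
const pubsubServiceAgent =
  'serviceAccount:service-PROJECT_NUMBER@gcp-sa-pubsub.iam.gserviceaccount.com';

async function grantDeadLetterRoles() {
  // Allow the service agent to publish to the dead letter topic.
  const topicIam = pubSubClient.topic('my-dead-letter-topic').iam;
  const [topicPolicy] = await topicIam.getPolicy();
  topicPolicy.bindings = topicPolicy.bindings || [];
  topicPolicy.bindings.push({
    role: 'roles/pubsub.publisher',
    members: [pubsubServiceAgent],
  });
  await topicIam.setPolicy(topicPolicy);

  // Allow the service agent to subscribe on the source subscription.
  const subIam = pubSubClient.subscription('testsub').iam;
  const [subPolicy] = await subIam.getPolicy();
  subPolicy.bindings = subPolicy.bindings || [];
  subPolicy.bindings.push({
    role: 'roles/pubsub.subscriber',
    members: [pubsubServiceAgent],
  });
  await subIam.setPolicy(subPolicy);
}

grantDeadLetterRoles().catch(console.error);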
Related
I am trying to create a Cloud Function that is triggered from a Pub/Sub subscription, but I need message ordering enabled. I know how to use the event_trigger block in the google_cloudfunctions_function block when creating a function linked to a subscription. However, that approach does not accept the enable_message_ordering setting described under Pub/Sub. When using the subscription push config instead, I don't know how to link the push endpoint to the function.
So is there a way I can link the function to a subscription with message ordering enabled?
Can I just use the internal URL to the function as the push config URL?
You can't use background functions triggered by Pub/Sub together with message ordering (or filtering).
You have to deploy an HTTP function instead (take care: the signature of the function changes, and the format of the Pub/Sub message also changes slightly).
Then create a Pub/Sub push subscription and use the Cloud Functions URL as the push endpoint. It's also best to add a service account on the subscription so that only Pub/Sub is allowed to call your function.
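As a rough illustration of the signature change, here is a minimal sketch of what the HTTP-triggered function could look like (the function name and log text are placeholders, not from the original question). A push subscription wraps the Pub/Sub message in a JSON envelope, and the data field is base64-encoded:
// HTTP-triggered function that receives Pub/Sub push deliveries.
exports.processEvent = (req, res) => {
  // The push envelope looks like { message: { data, attributes, messageId }, subscription }.
  const pubsubMessage = req.body && req.body.message;
  if (!pubsubMessage) {
    res.status(400).send('Not a Pub/Sub push request');
    return;
  }

  // The payload is base64-encoded inside the envelope.
  const payload = Buffer.from(pubsubMessage.data || '', 'base64').toString('utf8');
  console.log(`Received ordered message ${pubsubMessage.messageId}: ${payload}`);

  // Returning a 2xx status acknowledges the message; any other status makes Pub/Sub retry.
  res.status(204).send();
};
With ordering enabled, Pub/Sub delivers messages for a given ordering key one at a time and waits for the previous delivery to be acknowledged, so keep the handler fast and idempotent.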
For completeness, I wanted to add the Terraform that I used to do this, in case others are looking.
# This is the HTTP function that processes the events from PubSub, note it is set as an HTTP trigger
resource "google_cloudfunctions_function" "processEvent" {
  name    = "processEvent"
  runtime = var.RUNTIME
  environment_variables = {
    GCP_PROJECT_ID = var.GCP_PROJECT_ID
    LOG_LEVEL      = var.LOG_LEVEL
  }
  available_memory_mb   = var.AVAILABLE_MEMORY
  timeout               = var.TIMEOUT
  source_archive_bucket = var.SOURCE_ARCHIVE_BUCKET
  source_archive_object = google_storage_bucket_object.processor-archive.name
  trigger_http          = true
  entry_point           = "processEvent"
}

# define the topic
resource "google_pubsub_topic" "event-topic" {
  name = "event-topic"
}

# We need to create the subscription specifically as we need to enable message ordering
resource "google_pubsub_subscription" "processEvent_subscription" {
  name                 = "processEvent_subscription"
  topic                = google_pubsub_topic.event-topic.name
  ack_deadline_seconds = 20
  push_config {
    push_endpoint = "https://${var.REGION}-${var.GCP_PROJECT_ID}.cloudfunctions.net/${google_cloudfunctions_function.processEvent.name}"
    oidc_token {
      # a new IAM service account is needed to allow the subscription to trigger the function
      service_account_email = "cloudfunctioninvoker@${var.GCP_PROJECT_ID}.iam.gserviceaccount.com"
    }
  }
  enable_message_ordering = true
}
We are developing a GCP Cloud Tasks based queue process that sends a status email whenever a particular Firestore doc write-trigger fires. We use Cloud Tasks so that a delay can be created (using the scheduleTime property, 2 minutes in the future) before the email is sent, and to control dedup (by using a task name formatted as [firestore-collection-name]-[doc-id]), since the 'write' trigger on the Firestore doc can fire several times as the document is being created and then quickly updated by backend cloud functions.
Once the task's delay period has been reached, the cloud task runs and the email is sent with the updated Firestore document info included, after which the task is deleted from the queue and all is good.
Except:
If the user updates the Firestore doc (say 20 or 30 min later) we want to resend the status email but are unable to create the task using the same task-name. We get the following error:
409 The task cannot be created because a task with this name existed too recently. For more information about task de-duplication see https://cloud.google.com/tasks/docs/reference/rest/v2/projects.locations.queues.tasks/create#body.request_body.FIELDS.task.
This was unexpected, as the queue is empty at this point and the last task completed successfully. The documentation referenced in the error message says:
If the task's queue was created using Cloud Tasks, then another task
with the same name can't be created for ~1hour after the original task
was deleted or executed.
Question: is there some way in which this restriction can be bypassed, either by lowering the amount of time or by removing the restriction altogether?
The short answer is no. As you've already pointed out, the docs are very clear regarding this behavior: you have to wait about 1 hour before creating a task with the same name as one that was previously created. Neither the API nor the client libraries allow you to decrease this time.
Having said that, I would suggest that instead of reusing the same task ID, you use a different one for each task and add an identifier to the body of the request. For example, using Python:
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime
def create_task(project, queue, location, payload=None, in_seconds=None):
    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)

    task = {
        'app_engine_http_request': {
            'http_method': 'POST',
            'relative_uri': '/task/' + queue
        }
    }

    if payload is not None:
        converted_payload = payload.encode()
        task['app_engine_http_request']['body'] = converted_payload

    if in_seconds is not None:
        d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
        timestamp = timestamp_pb2.Timestamp()
        timestamp.FromDatetime(d)
        task['schedule_time'] = timestamp

    response = client.create_task(parent, task)

    print('Created task {}'.format(response.name))
    print(response)


# You can change DOCUMENT_ID with USER_ID or something to identify the task
create_task(PROJECT_ID, QUEUE, REGION, DOCUMENT_ID)
Facing a similar problem of needing to debounce multiple instances of Firestore write-trigger functions, we worked around the default Cloud Tasks task-name based dedup mechanism (still a constraint in Nov 2022) by building a small debounce "helper" using Firestore transactions.
We're using a helper collection _syncHelper_ to implement a delayed throttle for side effects of write-trigger fires - in the OP's case, send 1 email for all writes within 2 minutes.
In our case we are using the Firebase Functions task queue utils rather than interacting with Cloud Tasks directly, but that's immaterial to the solution. The key is to determine the task's execution time in advance and use that as the "dedup key":
// Imports assumed for this sketch (firebase-admin modular API)
const { getFirestore, Timestamp } = require('firebase-admin/firestore');
const { getFunctions } = require('firebase-admin/functions');

async function enqueueTask(shopId) {
  const queueName = 'doSomething';
  const now = new Date();
  const next = new Date(now.getTime() + 2 * 60 * 1000);
  try {
    const shouldEnqueue = await getFirestore().runTransaction(async t => {
      const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
      const doc = await t.get(syncRef);
      let data = doc.data();
      // If the stored "next execution" time is still in the future, skip enqueueing.
      if (data?.timestamp.toDate() > now) {
        return false;
      }
      await t.set(syncRef, { timestamp: Timestamp.fromDate(next) });
      return true;
    });
    if (shouldEnqueue) {
      let queue = getFunctions().taskQueue(queueName);
      await queue.enqueue(
        { timestamp: next.toISOString() },
        { scheduleTime: next }
      );
    }
  } catch {
    // ...
  }
}
This will ensure a new task is enqueued only if the "next execution" time has passed.
The execution operation (also a cloud function in our case) will remove the sync data entry if it hasn't been changed since it was executed:
exports.doSomething = functions.tasks.taskQueue({
  retryConfig: {
    maxAttempts: 2,
    minBackoffSeconds: 60,
  },
  rateLimits: {
    maxConcurrentDispatches: 2,
  }
}).onDispatch(async data => {
  let { timestamp } = data;
  await sendYourEmailHere();
  await getFirestore().runTransaction(async t => {
    const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
    const doc = await t.get(syncRef);
    let data = doc.data();
    if (data?.timestamp.toDate() <= new Date(timestamp)) {
      await t.delete(syncRef);
    }
  });
});
This isn't a bulletproof solution (if the doSomething() execution function has high latency, for example), but it's good enough for 99% of our use cases.
I have a simple contact flow like below from which I trigger the call from Amazon Connect (claimed phone number in AWS Connect) to the end customer (real customer phone number):
Now I want to connect an agent in the Amazon Connect end.
When I run the following code, I need the call to be placed from Amazon Connect (the customer agent) to the end customer (the real customer phone number):
const AWS = require('aws-sdk');
AWS.config.update({ region: 'us-east-1' });

exports.handler = (event, context, callback) => {
  let connect = new AWS.Connect();

  const customerName = event.name;
  const customerPhoneNumber = event.number;
  const dayOfWeek = event.day;

  let params = {
    "InstanceId": '12345l-abcd-1234-abcde-123456789bcde',
    "ContactFlowId": '987654-lkjhgf-9875-abcde-poiuyt0987645',
    "SourcePhoneNumber": '+1123456789',
    "DestinationPhoneNumber": customerPhoneNumber,
    "Attributes": {
      'name': customerName,
      'dayOfWeek': dayOfWeek
    }
  };

  connect.startOutboundVoiceContact(params, function (error, response) {
    if (error) {
      console.log(error);
      callback("Error", null);
    } else {
      console.log('Initiated an outbound call with Contact Id ' + JSON.stringify(response.ContactId));
      callback(null, 'Success');
    }
  });
};
How do I add the customer agent in the contact flow?
Logging is not working (I am not able to find any logs in CloudWatch).
Is my call recording block added in the right section of the contact flow?
To connect the call to an agent, you need to add a “set working queue” block to set the call to route to a queue where you have available agents. After you set your queue, replace the “disconnect / hang up” block with a “transfer to queue” block. This will route the call to an available agent or queue the call if no agent is immediately available.
Recording will only occur for the portion of the call between the agent and the outside party, so you won’t see any recordings for calls that didn’t get connected to an agent. Since you have the “set recording behavior” block set to “customer and agent” in your flow already, you should get a recording file when the call gets connected to an agent with the steps above.
This can be considered a follow-up to this thread, but I need more help with moving things along. Hopefully someone can have a look over my attempts below and provide further guidance.
To summarize, I need a cloud function that
Is triggered by a PubSub message being published to topic A (this can be done in the UI).
Reads a messy object change notification message from "push" PubSub topic A.
"Parses" it.
Publishes a message to PubSub topic B, with the original message ID as data and other metadata (e.g. file name, size, time) as attributes.
Question 1:
Example of a messy object change notification:
\n "kind": "storage#object",\n "id": "bucketcfpubsub/test.txt/1544681756538155",\n "selfLink": "https://www.googleapis.com/storage/v1/b/bucketcfpubsub/o/test.txt",\n "name": "test.txt",\n "bucket": "bucketcfpubsub",\n "generation": "1544681756538155",\n "metageneration": "1",\n "contentType": "text/plain",\n "timeCreated": "2018-12-13T06:15:56.537Z",\n "updated": "2018-12-13T06:15:56.537Z",\n "storageClass": "STANDARD",\n "timeStorageClassUpdated": "2018-12-13T06:15:56.537Z",\n "size": "1938",\n "md5Hash": "sDSXIvkR/PBg4mHyIUIvww==",\n "mediaLink": "https://www.googleapis.com/download/storage/v1/b/bucketcfpubsub/o/test.txt?generation=1544681756538155&alt=media",\n "crc32c": "UDhyzw==",\n "etag": "CKvqjvuTnN8CEAE="\n}\n
To clarify, is this a message with a blank "data" field, with all the information above in attribute pairs (like "attribute name": "attribute data")? Or is it just a long string stuffed into the "data" field, with no "attributes"?
Question 2:
In the above thread, a "pull" subscription is used. Is it better than using a "push" subscription? Push sample below:
def create_push_subscription(project_id,
                             topic_name,
                             subscription_name,
                             endpoint):
    """Create a new push subscription on the given topic."""
    # [START pubsub_create_push_subscription]
    from google.cloud import pubsub_v1

    # TODO project_id = "Your Google Cloud Project ID"
    # TODO topic_name = "Your Pub/Sub topic name"
    # TODO subscription_name = "Your Pub/Sub subscription name"
    # TODO endpoint = "https://my-test-project.appspot.com/push"

    subscriber = pubsub_v1.SubscriberClient()
    topic_path = subscriber.topic_path(project_id, topic_name)
    subscription_path = subscriber.subscription_path(
        project_id, subscription_name)

    push_config = pubsub_v1.types.PushConfig(
        push_endpoint=endpoint)

    subscription = subscriber.create_subscription(
        subscription_path, topic_path, push_config)

    print('Push subscription created: {}'.format(subscription))
    print('Endpoint for subscription is: {}'.format(endpoint))
    # [END pubsub_create_push_subscription]
Or do I need further code after this to receive messages?
Also, doesn't this create a new subscriber every time the Cloud Function is triggered by a pubsub message being published? Should I add a subscription delete code at the end of the CF, or are there more efficient ways to do this?
Question 3:
Next, to parse the message, this sample code extracts a few attributes as follows:
def summarize(message):
    # [START parse_message]
    data = message.data
    attributes = message.attributes

    event_type = attributes['eventType']
    bucket_id = attributes['bucketId']
    object_id = attributes['objectId']
Will this work with my notification above in Question 1?
Question 4:
How do I separate the topic_name? Steps 1 and 2 use topic A, while this step is to publish into topic B. Is it as simple as rewriting the topic_name in the code example below?
# TODO topic_name = "Your Pub/Sub topic name"

publisher = pubsub_v1.PublisherClient()
topic_path = publisher.topic_path(project_id, topic_name)

for n in range(1, 10):
    data = u'Message number {}'.format(n)
    # Data must be a bytestring
    data = data.encode('utf-8')
    # Add two attributes, origin and username, to the message
    publisher.publish(
        topic_path, data, origin='python-sample', username='gcp')

print('Published messages with custom attributes.')
Source where I got most of the sample code from (besides the above thread): python-docs-samples. Will adapting and stringing the above code samples together produce useful code? Or will I still be missing stuff like "import ****"?
You should not attempt to manually create a Subscriber running in Cloud Functions. Instead, follow the documentation here for setting up a Cloud Function that will be called with all messages sent to a given topic by passing the --trigger-topic command line parameter (see the sketch at the end of this answer).
To address some of your other concerns:
“Should I add a subscription delete code at the end of the CF?” - Subscriptions are long-lived resources corresponding to a specific backlog of messages. If the subscription is created and deleted at the end of the cloud function, messages sent when it does not exist will not be received.
“How do I separate the topic_name?” - The 'topic_name' in this example refers to the last part of the string formatted like projects/project_id/topics/topic_name, which will appear on this page in the Cloud Console for your topic after it has been created.
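To tie the pieces together, here is a minimal sketch of the kind of function the documentation describes (shown in Node.js for brevity; the function name, topic B name, and chosen attributes are placeholders, not prescribed by the docs). Deployed with --trigger-topic on topic A, it parses the Cloud Storage notification and republishes selected fields to topic B using a recent @google-cloud/pubsub client:
const {PubSub} = require('@google-cloud/pubsub');

const pubSubClient = new PubSub();
// Placeholder name for "topic B".
const outputTopic = pubSubClient.topic('topic-b');

// Deployed with: gcloud functions deploy parseNotification --trigger-topic <topic-a> ...
exports.parseNotification = async (message, context) => {
  // For --trigger-topic functions the notification body arrives base64-encoded in
  // message.data, with fields like eventType/bucketId/objectId duplicated in message.attributes.
  const payload = message.data
    ? JSON.parse(Buffer.from(message.data, 'base64').toString())
    : {};

  // Republish to topic B: the triggering event ID as data, selected metadata as attributes.
  await outputTopic.publishMessage({
    data: Buffer.from(context.eventId),
    attributes: {
      objectName: payload.name || '',
      size: payload.size || '',
      timeCreated: payload.timeCreated || '',
    },
  });
};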
How do I list all SQS queues in an AWS account programmatically via the API and .Net SDK?
I am already doing something similar with DynamoDb tables, and that's fairly straightforward - you can page through results using ListTables in a loop until you have them all.
However, the equivalent SQS API endpoint, ListQueues, is different and not as useful. It returns up to 1000 queues, with no option of paging.
Yes, there can be over 1000 queues in my case. I have had a query return exactly 1000 results. It's all in 1 region, so it's not the same as this question.
You can retrieve SQS queue names from CloudWatch, which supports paging. It will only return queues that are considered active.
An active queue is described as:
A queue is considered active by CloudWatch for up to six hours from
the last activity (for example, any API call) on the queue.
Something like this should work:
var client = new AmazonCloudWatchClient(RegionEndpoint.EUWest1);

string nextToken = null;
var results = Enumerable.Empty<string>();

do
{
    var result = client.ListMetrics(new ListMetricsRequest()
    {
        MetricName = "ApproximateAgeOfOldestMessage",
        NextToken = nextToken
    });

    results = results.Concat(
        result
            .Metrics
            .SelectMany(x => x.Dimensions.Where(d => d.Name == "QueueName")
                              .Select(d => d.Value))
    );

    nextToken = result.NextToken;
} while (nextToken != null);