How to enumerate all SQS queues in an AWS account - amazon-web-services

How do I list all SQS queues in an AWS account programmatically via the API and .Net SDK?
I am already doing something similar with DynamoDb tables, and that's fairly straightforward - you can page through results using ListTables in a loop until you have them all.
However the equivalent SQS API endpoint, ListQueues, is different and not as useful. It returns up to 1000 queues, with no option of paging.
Yes, there can be over 1000 queues in my case. I have had a query return exactly 1000 results. It's all in 1 region, so it's not the same as this question.

You can retrieve SQS queue names from Cloudwatch, which supports paging. It will only return queues that are considered active.
An active queue is described as:
A queue is considered active by CloudWatch for up to six hours from
the last activity (for example, any API call) on the queue.
Something like this should work:
using System.Collections.Generic;
using System.Linq;
using Amazon;
using Amazon.CloudWatch;
using Amazon.CloudWatch.Model;

var client = new AmazonCloudWatchClient(RegionEndpoint.EUWest1);
string nextToken = null;
var results = Enumerable.Empty<string>();
do
{
    var result = client.ListMetrics(new ListMetricsRequest()
    {
        // Restrict to SQS metrics; only queues with activity in the last ~6 hours are listed
        Namespace = "AWS/SQS",
        MetricName = "ApproximateAgeOfOldestMessage",
        NextToken = nextToken
    });
    // Every SQS metric carries a QueueName dimension; collect its value
    results = results.Concat(
        result
            .Metrics
            .SelectMany(x => x.Dimensions.Where(d => d.Name == "QueueName")
                                         .Select(d => d.Value))
    );
    nextToken = result.NextToken;   // null once the last page has been returned
} while (nextToken != null);

var queueNames = results.Distinct().ToList();

Related

how to get all log groups names

I have a lambda that exports all of our loggroups to s3, and am currently using cloudwatchlogs.describeLogGroups to list
all of our logGroups.
const logGroupsResponse = await cloudwatchlogs.describeLogGroups({ limit: 50 })
The issue is that we have 69 logGroups is there any way to list (ids, and names) of absolutely all logGroups in an aws account. I see it's possible to have 1000 log groups. This is a screenshot of our console:
How come cloudwatchlogs.describeLogGroups just allows a limit of 50 which is very small?
Assuming that you are using AWS JS SDK v2, the describeLogGroups API provides a nextToken in its response and also accepts a nextToken. This token is used for retrieving more than 50 log groups by sending multiple requests. We can use the following pattern to accomplish this:
const AWS = require('aws-sdk');
const cloudwatchlogs = new AWS.CloudWatchLogs({region: 'us-east-1'});

async function listAllLogGroups() {
    const arns = [];
    let nextToken = null;
    do {
        const logGroupsResponse = await cloudwatchlogs.describeLogGroups({limit: 50, nextToken: nextToken}).promise();
        // Do something with the retrieved log groups
        console.log(logGroupsResponse.logGroups.map(group => group.arn));
        arns.push(...logGroupsResponse.logGroups.map(group => group.arn));
        // Get the next token. If there are no more log groups, the token will be undefined
        nextToken = logGroupsResponse.nextToken;
    } while (nextToken);
    return arns;
}
We are querying the AWS API in loop until there are no more log groups left.
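The two answers above use the same NextToken loop against two different APIs. The block below is an added example, not code from either post: one `paginate` helper reused for CloudWatch ListMetrics (the active-queue trick from the SQS answer), for DescribeLogGroups, and for SQS ListQueues itself, which today accepts MaxResults and returns a NextToken. The helpers take a client object so they can be exercised offline against the constructed fake clients and sample pages below; in production pass `boto3.client("cloudwatch")`, `boto3.client("logs")` and `boto3.client("sqs")`. The account id, queue names and ARNs in the sample data are assumptions.

```python
from typing import Any, Callable, Dict, Iterator, List, Set


def paginate(call: Callable[..., Dict[str, Any]], items_key: str,
             token_in: str = "NextToken", token_out: str = "NextToken",
             **params: Any) -> Iterator[Any]:
    """Yield every item from a token-paginated AWS list call.

    Invokes call(**params), then keeps calling with the token from the previous
    response until a response carries no token. A token that repeats is treated
    as an error so a misbehaving endpoint or mock cannot spin us forever.
    """
    token = None
    seen: Set[str] = set()
    while True:
        kwargs = dict(params)
        if token is not None:
            kwargs[token_in] = token
        page = call(**kwargs)
        for item in page.get(items_key, []):
            yield item
        token = page.get(token_out)
        if not token:
            return
        if token in seen:
            raise RuntimeError("pagination token repeated; aborting")
        seen.add(token)


def sqs_queue_names_from_cloudwatch(cloudwatch) -> Set[str]:
    """Queue names CloudWatch knows about (only queues active in the last ~6 hours)."""
    names: Set[str] = set()
    for metric in paginate(cloudwatch.list_metrics, "Metrics",
                           Namespace="AWS/SQS",
                           MetricName="ApproximateAgeOfOldestMessage"):
        for dim in metric.get("Dimensions", []):
            if dim.get("Name") == "QueueName":
                names.add(dim["Value"])
    return names


def all_sqs_queue_urls(sqs) -> List[str]:
    """Every queue URL; ListQueues returns NextToken when MaxResults is set."""
    return list(paginate(sqs.list_queues, "QueueUrls", MaxResults=1000))


def all_log_group_arns(logs) -> List[str]:
    """Every log group ARN; DescribeLogGroups uses lower-case nextToken/limit."""
    return [g["arn"] for g in paginate(logs.describe_log_groups, "logGroups",
                                       token_in="nextToken", token_out="nextToken",
                                       limit=50)]


# ---- constructed stand-ins so the helpers above run offline -------------------

def make_fake_client(method_name: str, pages: List[Dict[str, Any]],
                     token_in: str = "NextToken", token_out: str = "NextToken"):
    """Fake boto3 client serving `pages` in order; the token is the next page index."""
    class Fake:
        def __init__(self) -> None:
            self.calls: List[Dict[str, Any]] = []

        def _serve(self, **kwargs: Any) -> Dict[str, Any]:
            self.calls.append(kwargs)
            token = kwargs.get(token_in)
            idx = 0 if token is None else int(token)
            page = dict(pages[idx])
            if idx + 1 < len(pages):
                page[token_out] = str(idx + 1)
            return page

    fake = Fake()
    setattr(fake, method_name, fake._serve)
    return fake


ACCOUNT = "123456789012"  # assumed account id for the sample data
CLOUDWATCH_PAGES = [
    {"Metrics": [{"MetricName": "ApproximateAgeOfOldestMessage",
                  "Dimensions": [{"Name": "QueueName", "Value": "orders"}]},
                 {"MetricName": "ApproximateAgeOfOldestMessage",
                  "Dimensions": [{"Name": "QueueName", "Value": "audit-log"}]}]},
    {"Metrics": [{"MetricName": "ApproximateAgeOfOldestMessage",
                  "Dimensions": [{"Name": "QueueName", "Value": "orders"}]}]},
]
SQS_PAGES = [
    {"QueueUrls": [f"https://sqs.eu-west-1.amazonaws.com/{ACCOUNT}/orders",
                   f"https://sqs.eu-west-1.amazonaws.com/{ACCOUNT}/audit-log"]},
    {"QueueUrls": [f"https://sqs.eu-west-1.amazonaws.com/{ACCOUNT}/orders-dlq"]},
]
LOGS_PAGES = [
    {"logGroups": [{"logGroupName": "/aws/lambda/a",
                    "arn": f"arn:aws:logs:us-east-1:{ACCOUNT}:log-group:/aws/lambda/a:*"}]},
    {"logGroups": [{"logGroupName": "/aws/lambda/b",
                    "arn": f"arn:aws:logs:us-east-1:{ACCOUNT}:log-group:/aws/lambda/b:*"}]},
]


def demo() -> Dict[str, Any]:
    """Run all three enumerations against the fakes; swap in boto3 clients for real use."""
    cw = make_fake_client("list_metrics", CLOUDWATCH_PAGES)
    sqs = make_fake_client("list_queues", SQS_PAGES)
    logs = make_fake_client("describe_log_groups", LOGS_PAGES, "nextToken", "nextToken")
    return {"active_queues": sqs_queue_names_from_cloudwatch(cw),
            "queue_urls": all_sqs_queue_urls(sqs),
            "log_group_arns": all_log_group_arns(logs),
            "cloudwatch_calls": len(cw.calls), "sqs_calls": len(sqs.calls)}
```

The CloudWatch route still only sees queues that were active recently, so for an inventory (for example, to find queues with overly permissive policies) prefer paginated ListQueues and use the metric listing only as a cross-check.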

PubSub Maximum delivery attempts & Dead letter topic

Happy May 1st,
I'm doing a simple POC to utilize the dead letter topic feature of Pub/Sub. I configured my subscription to republish messages to a separate dead letter topic after 20 maximum delivery attempts (below is the subscription pull code and sample message used).
Note: I configured the subscription using Cloud Console.
Problem/challenge: Even after 36 delivery attempts the test message is still not republished to the dead letter topic. Based on the documentation I would assume my test message will be republished to the dead letter topic and shouldn't be delivered after 20 attempts. What am I missing?
Pull Subscription code
const {PubSub} = require('@google-cloud/pubsub');
var moment = require('moment');

process.env['GOOGLE_APPLICATION_CREDENTIALS'] = 'abcxyz.json';
const pubSubClient = new PubSub();
const timeout = 100;

async function listenWithCustomAttributes() {
    const subscription = pubSubClient.subscription("projects/random-1234/subscriptions/testsub");

    // Create an event handler to handle messages
    const messageHandler = (message) => {
        // moment format tokens are case-sensitive: MMMM = month name, Do = day, YYYY = year
        const datetime = moment().format('MMMM Do YYYY, h:mm:ss a');
        console.log(`${datetime}::: ${message.id}:`);
        console.log(`${message.data}`);
        console.log(`Delivery Attempt: ${message.deliveryAttempt}`);
        console.log(`custom Attributes: ${JSON.stringify(message.attributes)}`);
        console.log('\n');

        // NACK for re-delivery; each redelivery increments deliveryAttempt
        message.nack();
    };

    subscription.on('message', messageHandler);

    // Stop listening after the timeout
    setTimeout(() => {
        subscription.removeListener('message', messageHandler);
    }, timeout * 1000000);
}

listenWithCustomAttributes();
Sample PubSub message
const message = {
"event": "First",
"message": "HELLOWORLD!!!!",
};
I finally was able to address this issue.
According to documentation "If you configured the subscription using Cloud Console, the roles are granted automatically." But, that no longer seems valid. We need to grant the required publisher & subscriber role in "DEAD LETTERING" (beside OVERVIEW) in the console of the subscription or add iam policy as described in the docs.
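The answer above comes down to two IAM bindings for the Pub/Sub service agent: `roles/pubsub.publisher` on the dead-letter topic and `roles/pubsub.subscriber` on the subscription. The block below is an added example, not code from the post, that checks two IAM policies for exactly those bindings and emits the `gcloud` commands that add whatever is missing. It works on plain policy dicts shaped like the output of `gcloud pubsub topics get-iam-policy TOPIC --format=json` (and the `subscriptions` equivalent), so it runs offline; the project number, topic and subscription names and the sample policies are constructed assumptions.

```python
from typing import Dict, List, Tuple

PUBSUB_AGENT = "serviceAccount:service-{project_number}@gcp-sa-pubsub.iam.gserviceaccount.com"
TOPIC_ROLE = "roles/pubsub.publisher"          # needed on the dead-letter topic
SUBSCRIPTION_ROLE = "roles/pubsub.subscriber"  # needed on the subscription


def pubsub_service_agent(project_number: int) -> str:
    """The Pub/Sub service agent that forwards dead-lettered messages (project *number*, not id)."""
    return PUBSUB_AGENT.format(project_number=project_number)


def has_binding(policy: Dict, role: str, member: str) -> bool:
    """True if `member` is bound to `role` in {"bindings": [{"role": ..., "members": [...]}]}."""
    return any(b.get("role") == role and member in b.get("members", [])
               for b in policy.get("bindings", []))


def missing_dead_letter_bindings(project_number: int, topic_policy: Dict,
                                 subscription_policy: Dict) -> List[Tuple[str, str]]:
    """Return (resource kind, role) pairs the service agent still lacks."""
    agent = pubsub_service_agent(project_number)
    missing = []
    if not has_binding(topic_policy, TOPIC_ROLE, agent):
        missing.append(("topic", TOPIC_ROLE))
    if not has_binding(subscription_policy, SUBSCRIPTION_ROLE, agent):
        missing.append(("subscription", SUBSCRIPTION_ROLE))
    return missing


def grant_commands(project_number: int, topic: str, subscription: str,
                   missing: List[Tuple[str, str]]) -> List[str]:
    """gcloud commands that add exactly the missing bindings, nothing broader."""
    agent = pubsub_service_agent(project_number)
    cmds = []
    for kind, role in missing:
        resource = "topics" if kind == "topic" else "subscriptions"
        name = topic if kind == "topic" else subscription
        cmds.append(f"gcloud pubsub {resource} add-iam-policy-binding {name} "
                    f"--member={agent} --role={role}")
    return cmds


# Constructed sample: the topic policy is complete, the subscription policy is not.
PROJECT_NUMBER = 123456789012           # assumed
DEAD_LETTER_TOPIC = "testsub-dead-letter"  # assumed
SUBSCRIPTION = "testsub"
TOPIC_POLICY = {"bindings": [
    {"role": "roles/pubsub.publisher", "members": [pubsub_service_agent(PROJECT_NUMBER)]},
]}
SUBSCRIPTION_POLICY = {"bindings": [
    {"role": "roles/pubsub.subscriber", "members": ["user:alice@example.com"]},
]}


def check() -> Dict[str, List]:
    """Diagnose the sample: the agent can publish to the DLT but cannot consume the subscription."""
    missing = missing_dead_letter_bindings(PROJECT_NUMBER, TOPIC_POLICY, SUBSCRIPTION_POLICY)
    return {"missing": missing,
            "fix": grant_commands(PROJECT_NUMBER, DEAD_LETTER_TOPIC, SUBSCRIPTION, missing)}
```

Feed it real policies by dumping them with `gcloud ... get-iam-policy --format=json`; a subscription that keeps redelivering past its maximum attempts, as in the question, is almost always one of these two bindings missing.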

GCP Cloud Tasks: shorten period for creating a previously created named task

We are developing a GCP Cloud Task based queue process that sends a status email whenever a particular Firestore doc write-trigger fires. The reason we use Cloud Tasks is so a delay can be created (using scheduledTime property 2-min in the future) before the email is sent, and to control dedup (by using a task-name formatted as: [firestore-collection-name]-[doc-id]) since the 'write' trigger on the Firestore doc can be fired several times as the document is being created and then quickly updated by backend cloud functions.
Once the task's delay period has been reached, the cloud-task runs, and the email is sent with updated Firestore document info included. After which the task is deleted from the queue and all is good.
Except:
If the user updates the Firestore doc (say 20 or 30 min later) we want to resend the status email but are unable to create the task using the same task-name. We get the following error:
409 The task cannot be created because a task with this name existed too recently. For more information about task de-duplication see https://cloud.google.com/tasks/docs/reference/rest/v2/projects.locations.queues.tasks/create#body.request_body.FIELDS.task.
This was unexpected as the queue is empty at this point as the last task completed successfully. The documentation referenced in the error message says:
If the task's queue was created using Cloud Tasks, then another task
with the same name can't be created for ~1hour after the original task
was deleted or executed.
Question: is there some way in which this restriction can be by-passed by lowering the amount of time, or even removing the restriction all together?
The short answer is no. As you've already pointed out, the docs are very clear regarding this behavior and you should wait 1 hour to create a task with the same name as one that was previously created. The API and client libraries do not allow decreasing this time.
Having said that, I would suggest that instead of using the same Task ID, use different ones for the task and add an identifier in the body of the request. For example, using Python:
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime


def create_task(project, queue, location, payload=None, in_seconds=None):
    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path(project, location, queue)
    # No 'name' is set, so Cloud Tasks assigns a unique id and the
    # ~1 hour name-reuse restriction never applies
    task = {
        'app_engine_http_request': {
            'http_method': 'POST',
            'relative_uri': '/task/' + queue
        }
    }
    if payload is not None:
        # The identifier travels in the body instead of in the task name
        converted_payload = payload.encode()
        task['app_engine_http_request']['body'] = converted_payload
    if in_seconds is not None:
        d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
        timestamp = timestamp_pb2.Timestamp()
        timestamp.FromDatetime(d)
        task['schedule_time'] = timestamp
    response = client.create_task(request={'parent': parent, 'task': task})
    print('Created task {}'.format(response.name))
    print(response)
    return response


# You can change DOCUMENT_ID to USER_ID or something else that identifies the task
create_task(PROJECT_ID, QUEUE, REGION, DOCUMENT_ID)
Facing a similar problem of requiring to debounce multiple instances of Firestore write-trigger functions, we worked around the default Cloud Tasks task-name based dedup mechanism (still a constraint in Nov 2022) by building a small debounce "helper" using Firestore transactions.
We're using a helper collection _syncHelper_ to implement a delayed throttle for side effects of write-trigger fires - in the OP's case, send 1 email for all writes within 2 minutes.
In our case we are using Firebase Functions task queue utils and not directly interacting with Cloud Tasks but that's immaterial to the solution. The key is to determine the task's execution time in advance and use that as the "dedup key":
const { getFirestore, Timestamp } = require('firebase-admin/firestore');
const { getFunctions } = require('firebase-admin/functions');

async function enqueueTask(shopId) {
    const queueName = 'doSomething';
    const now = new Date();
    const next = new Date(now.getTime() + 2 * 60 * 1000);
    try {
        const shouldEnqueue = await getFirestore().runTransaction(async t => {
            const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
            const doc = await t.get(syncRef);
            let data = doc.data();
            // An execution is already scheduled in the future: absorb this trigger
            if (data?.timestamp.toDate() > now) {
                return false;
            }
            await t.set(syncRef, { timestamp: Timestamp.fromDate(next) });
            return true;
        });
        if (shouldEnqueue) {
            let queue = getFunctions().taskQueue(queueName);
            await queue.enqueue({ timestamp: next.toISOString() }, { scheduleTime: next });
        }
    } catch (err) {
        // ...
    }
}
This will ensure a new task is enqueued only if the "next execution" time has passed.
The execution operation (also a cloud function in our case) will remove the sync data entry if it hasn't been changed since it was executed:
const functions = require('firebase-functions');
const { getFirestore } = require('firebase-admin/firestore');

exports.doSomething = functions.tasks.taskQueue({
    retryConfig: {
        maxAttempts: 2,
        minBackoffSeconds: 60,
    },
    rateLimits: {
        maxConcurrentDispatches: 2,
    }
}).onDispatch(async data => {
    let { timestamp } = data;
    await sendYourEmailHere();
    await getFirestore().runTransaction(async t => {
        const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
        const doc = await t.get(syncRef);
        let data = doc.data();
        // Only clear the helper entry if no newer execution was scheduled meanwhile
        if (data?.timestamp.toDate() <= new Date(timestamp)) {
            await t.delete(syncRef);
        }
    });
});
This isn't a bullet proof solution (if the doSomething() execution function has high latency for example) but good enough for 99% of our use cases.
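Both answers above work around the name-reuse window, one by never naming tasks and one with a Firestore transaction. The block below is an added example, not code from either post, that goes one step further and lets Cloud Tasks' own name-based de-duplication do the debouncing: the task id is derived from the document key plus the index of the current 2-minute window, so every trigger inside a window maps to the same id (the API's 409 is then the signal that the trigger was absorbed), while a later update lands in a new window and gets a fresh id. Because the window index only grows, an id is never reused and the ~1 hour restriction cannot be hit. The `AlreadyExists` class and `FakeCloudTasks` stand in for `google.api_core.exceptions.AlreadyExists` and `tasks_v2.CloudTasksClient` so it runs offline; the queue path, handler URL and document key are assumptions.

```python
import hashlib
import re
from datetime import datetime, timedelta, timezone
from typing import Dict, Tuple

TASK_ID_RE = re.compile(r"[A-Za-z0-9_-]{1,500}")  # Cloud Tasks task id charset and length


class AlreadyExists(Exception):
    """Stand-in for google.api_core.exceptions.AlreadyExists (the 409 from the question)."""


def window_index(now: datetime, window_seconds: int) -> int:
    return int(now.timestamp()) // window_seconds


def window_end(now: datetime, window_seconds: int) -> datetime:
    """Schedule time: the end of the current window, so one task fires with the latest data."""
    return datetime.fromtimestamp((window_index(now, window_seconds) + 1) * window_seconds,
                                  tz=timezone.utc)


def debounced_task_id(doc_key: str, now: datetime, window_seconds: int) -> str:
    """Same id for every trigger inside a window, a fresh id for the next window.

    The hash prefix keeps ids non-sequential (Cloud Tasks advises against
    sequential names); the readable suffix helps when inspecting the queue.
    """
    idx = window_index(now, window_seconds)
    digest = hashlib.sha256(f"{doc_key}:{idx}".encode()).hexdigest()[:16]
    readable = re.sub(r"[^A-Za-z0-9_-]", "_", doc_key)[:64]
    task_id = f"{digest}-{readable}-{idx}"
    if not TASK_ID_RE.fullmatch(task_id):
        raise ValueError(f"invalid task id {task_id!r}")
    return task_id


def enqueue_debounced(client, parent: str, doc_key: str, body: bytes, now: datetime,
                      window_seconds: int = 120) -> Tuple[str, bool]:
    """Create this window's task; a 409 means it exists already and the trigger is absorbed."""
    name = f"{parent}/tasks/{debounced_task_id(doc_key, now, window_seconds)}"
    task = {  # mirrors the tasks_v2.Task fields; an http_request target is assumed
        "name": name,
        "schedule_time": window_end(now, window_seconds),
        "http_request": {"http_method": "POST",
                         "url": "https://example.invalid/send-status-email",
                         "body": body},
    }
    try:
        client.create_task(parent=parent, task=task)
        return name, True
    except AlreadyExists:
        return name, False


class FakeCloudTasks:
    """Constructed stand-in for CloudTasksClient: rejects duplicate names like the API."""
    def __init__(self) -> None:
        self.created: Dict[str, dict] = {}

    def create_task(self, parent: str, task: dict) -> dict:
        if task["name"] in self.created:
            raise AlreadyExists(task["name"])
        self.created[task["name"]] = task
        return task


def demo(window_seconds: int = 120) -> Dict[str, object]:
    """Four write-trigger firings: three inside one window, one 30 minutes later."""
    client = FakeCloudTasks()
    parent = "projects/random-1234/locations/us-central1/queues/status-emails"  # assumed
    t0 = datetime(2022, 11, 1, 12, 0, 5, tzinfo=timezone.utc)
    results = [enqueue_debounced(client, parent, "shops/abc123", b"{}",
                                 t0 + timedelta(seconds=offset), window_seconds)
               for offset in (0, 30, 90, 1800)]
    return {"names": [n for n, _ in results],
            "created": [c for _, c in results],
            "tasks_in_queue": len(client.created),
            "first_fire_at": window_end(t0, window_seconds).isoformat()}
```

The trade-off against the Firestore approach: the first trigger in a window, not the last, determines the schedule time, so the handler must re-read the document when it runs (as the original design already does) rather than trust a payload captured at enqueue time.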

I am learning to create AWS Lambdas. I want to create a "chain": S3 -> 4 Chained Lambda()'s -> RDS. I can't get the first lambda to call the second

I really tried everything. Surprisingly google has not many answers when it comes to this.
When a certain .csv file is uploaded to a S3 bucket I want to parse it and place the data into a RDS database.
My goal is to learn the lambda serverless technology, this is essentially an exercise. Thus, I over-engineered the hell out of it.
Here is how it goes:
S3 Trigger when the .csv is uploaded -> call lambda (this part fully works)
AAA_Thomas_DailyOverframeS3CsvToAnalytics_DownloadCsv downloads the csv from S3 and finishes with essentially the plaintext of the file. It is then supposed to pass it to the next lambda. The way I am trying to do this is by putting the second lambda as destination. The function works, but the second lambda is never called and I don't know why.
AAA_Thomas_DailyOverframeS3CsvToAnalytics_ParseCsv gets the plaintext as input and returns a javascript object with the parsed data.
AAA_Thomas_DailyOverframeS3CsvToAnalytics_DecryptRDSPass only connects to KMS, gets the encrypted RDS password, and passes it along with the data it received as input to the last lambda.
AAA_Thomas_DailyOverframeS3CsvToAnalytics_PutDataInRds then finally puts the data in RDS.
I created a custom VPC with custom subnets, route tables, gateways, peering connections, etc. I don't know if this is relevant but function 2. only has access to the s3 endpoint, 3. does not have any internet access whatsoever, 4. is the only one that has normal internet access (it's the only way to connect to KMS), and 5. only has access to the peered VPC which hosts the RDS.
This is the code of the first lambda:
// dependencies
const AWS = require('aws-sdk');
const s3 = new AWS.S3();

exports.handler = async (event, context) =>
{
    var checkDates = process.env.CheckDates == "false" ? false : true;
    var ret = [];

    var checkFileDate = function(actualFileName)
    {
        if (!checkDates)
            return true;
        var d = new Date();
        // getUTCMonth() is zero-based (January = 0), so add 1 before zero-padding
        var month = String(d.getUTCMonth() + 1).padStart(2, '0');
        var day = String(d.getUTCDate()).padStart(2, '0');
        var expectedFileName = 'Overframe_-_Analytics_by_Day_Device_' + d.getUTCFullYear() + '-' + month + '-' + day;
        return expectedFileName == actualFileName.substr(0, expectedFileName.length);
    };

    for (var i = 0; i < event.Records.length; ++i)
    {
        var record = event.Records[i];
        try {
            if (record.s3.bucket.name != process.env.S3BucketName)
            {
                console.error('Unexpected notification, unknown bucket: ' + record.s3.bucket.name);
                continue;
            }
            // S3 event notifications URL-encode the object key (a space arrives as '+')
            var key = decodeURIComponent(record.s3.object.key.replace(/\+/g, ' '));
            if (!checkFileDate(key))
            {
                console.error('Unexpected file, or date is not today\'s: ' + key);
                continue;
            }
            const params = {
                Bucket: record.s3.bucket.name,
                Key: key
            };
            var csvFile = await s3.getObject(params).promise();
            var allText = csvFile.Body.toString('utf-8');
            console.log('Loaded data:', {Bucket: params.Bucket, Filename: params.Key, Text: allText});
            ret.push(allText);
        } catch (error) {
            console.log("Couldn't download CSV from S3", error);
            // Returning (rather than throwing) counts as a successful invocation for destinations
            return { statusCode: 500, body: error };
        }
    }

    // I've been randomly trying different ways to return the data, none works. The data itself is correct , I checked with console.log()
    // Whatever is returned here is what an on-success destination receives as "responsePayload"
    const response = {
        statusCode: 200,
        body: { "Records": ret }
    };
    return response;
};
While this shows how the lambda was set up, especially its destination:
I haven't posted on Stackoverflow in 7 years. That's how desperate I am. Thanks for the help.
Rather than getting each Lambda to call the next one, take a look at AWS's managed service for state machines, Step Functions, which can handle this workflow for you.
By providing input and outputs you can pass output to the next function, with retry logic built into it.
If you haven't much experience, AWS has a tutorial on setting up a step function through chaining Lambdas.
By using this you also will not need to account for configuration issues such as Lambda timeouts. In addition it allows your code to be more modular which improves testing the individual functionality, whilst also isolating issues.
The execution roles of all Lambda functions, whose destinations include other Lambda functions, must have the lambda:InvokeFunction IAM permission in one of their attached IAM policies.
Here's a snippet from Lambda documentation:
To send events to a destination, your function needs additional permissions. Add a policy with the required permissions to your function's execution role. Each destination service requires a different permission, as follows:
Amazon SQS – sqs:SendMessage
Amazon SNS – sns:Publish
Lambda – lambda:InvokeFunction
EventBridge – events:PutEvents
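Once the execution role has `lambda:InvokeFunction`, there is a second thing that trips up chaining by destination: the next function does not receive the first function's return value as its event. It receives a destination envelope, and the return value sits under `responsePayload`. The block below is an added example, not code from the question or either answer: a Python handler for the second function (the ParseCsv step) that unwraps that envelope, refuses anything that is not a `Success` envelope from the expected upstream function (anyone allowed to invoke the function could hand-craft an envelope, so the sender is pinned), accepts both return shapes the first function might use (a bare list of texts, or `{statusCode, body: {Records}}`), and parses the CSV. The upstream ARN, the sample envelope and the CSV content are constructed.

```python
import csv
import io
from typing import Any, Dict, List

# Assumed ARN of the upstream (first) function; take the real one from its configuration.
EXPECTED_UPSTREAM_ARN = ("arn:aws:lambda:eu-west-1:123456789012:function:"
                         "AAA_Thomas_DailyOverframeS3CsvToAnalytics_DownloadCsv")


class RejectedEvent(ValueError):
    """Raised for anything that is not a successful envelope from the expected function."""


def unwrap_destination_event(event: Any, expected_upstream_arn: str) -> List[str]:
    """Return the CSV texts carried in an on-success destination envelope.

    The envelope looks like {"version": "1.0", "timestamp": ..., "requestContext":
    {"requestId", "functionArn", "condition", "approximateInvokeCount"},
    "requestPayload": <original S3 event>, "responseContext": {...},
    "responsePayload": <whatever the first function returned>}.
    """
    if not isinstance(event, dict) or not isinstance(event.get("requestContext"), dict):
        raise RejectedEvent("not a Lambda destination envelope")
    ctx = event["requestContext"]
    if ctx.get("condition") != "Success":
        raise RejectedEvent(f"upstream condition {ctx.get('condition')!r} is not Success")
    if ctx.get("functionArn") != expected_upstream_arn:
        raise RejectedEvent(f"unexpected upstream function {ctx.get('functionArn')!r}")
    payload = event.get("responsePayload")
    if isinstance(payload, dict):  # {statusCode, body: {Records: [...]}} shape
        body = payload.get("body")
        payload = body.get("Records") if isinstance(body, dict) else None
    if not isinstance(payload, list) or not all(isinstance(t, str) for t in payload):
        raise RejectedEvent("responsePayload does not carry a list of CSV texts")
    return payload


def parse_csv(text: str) -> List[Dict[str, str]]:
    return list(csv.DictReader(io.StringIO(text)))


def handler(event: Any, context: Any = None) -> Dict[str, Any]:
    """Second function in the chain: unwrap, parse, and return rows for the next destination."""
    texts = unwrap_destination_event(event, EXPECTED_UPSTREAM_ARN)
    rows = [row for text in texts for row in parse_csv(text)]
    return {"statusCode": 200, "body": {"rows": rows}}


# Constructed sample envelope, shaped like what Lambda delivers to an on-success destination.
SAMPLE_CSV = "date,device,sessions\n2022-11-01,ios,5\n2022-11-01,android,7\n"
SAMPLE_EVENT = {
    "version": "1.0",
    "timestamp": "2022-11-01T12:02:00.000Z",
    "requestContext": {
        "requestId": "00000000-0000-4000-8000-000000000000",
        "functionArn": EXPECTED_UPSTREAM_ARN,
        "condition": "Success",
        "approximateInvokeCount": 1,
    },
    "requestPayload": {"Records": [{"s3": {
        "bucket": {"name": "example-bucket"},
        "object": {"key": "Overframe_-_Analytics_by_Day_Device_2022-11-01.csv"}}}]},
    "responseContext": {"statusCode": 200, "executedVersion": "$LATEST"},
    "responsePayload": {"statusCode": 200, "body": {"Records": [SAMPLE_CSV]}},
}


def failed_event() -> Dict[str, Any]:
    """Same envelope as delivered to an on-failure destination after retries ran out."""
    ev = dict(SAMPLE_EVENT)
    ev["requestContext"] = dict(SAMPLE_EVENT["requestContext"], condition="RetriesExhausted")
    ev["responsePayload"] = {"errorMessage": "Couldn't download CSV from S3", "errorType": "Error"}
    return ev
```

Configure the on-success destination of DownloadCsv to point at this function and the on-failure destination somewhere else (an SQS queue or an alerting topic); routing both to the same function is what the `condition` check guards against.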

boto3: can't find queue that was immediately created before

I create an SQS queue in boto3 and immediately look for it via sqs.list_queues but it won't return anything.
when I input the SQS queue name into the console, it won't return anything until I input it again the second time.
So does this mean I need to call list_queues twice? Why is this happening? Why doesn't AWS return queues that were created immediately before?
sqs = boto3.client('sqs')
myQ = sqs.create_queue(QueueName='just_created')
response = sqs.list_queues(
QueueNamePrefix='just_created'
)
response does not contain the usual array of QueueUrls
Just like many AWS services, the SQS control plane is eventually consistent, meaning that it takes a while to propagate the data across the systems.
If you need the URL of the queue you just created, you can find it in the return value of the create_queue call.
The following operation creates an SQS queue named MyQueue.
response = client.create_queue(
QueueName='MyQueue',
)
print(response)
Expected Output:
{
'QueueUrl': 'https://queue.amazonaws.com/012345678910/MyQueue',
'ResponseMetadata': {
'...': '...',
},
}
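The answer above is right that the `QueueUrl` in the CreateQueue response is authoritative. When code nevertheless has to confirm that a queue is visible through ListQueues (a deployment check, an inventory scan, a cleanup job), the block below is an added example, not code from the post, of doing that properly: bounded polling with exponential backoff instead of an arbitrary second call, and an exact-name comparison, because `QueueNamePrefix` is a prefix match and `just_created` would also match a `just_created_v2`. The `FakeSqs` class stands in for `boto3.client("sqs")` and makes the new queue appear only after a configurable number of list calls; the account URL and the pre-existing queue name are assumptions.

```python
import time
from typing import Callable, Dict, Optional, Tuple


def find_queue_url(sqs, queue_name: str) -> Optional[str]:
    """Exact-name lookup via ListQueues; QueueNamePrefix is a prefix match, so filter."""
    resp = sqs.list_queues(QueueNamePrefix=queue_name)
    for url in resp.get("QueueUrls", []):
        if url.rsplit("/", 1)[-1] == queue_name:
            return url
    return None


def wait_until_listed(sqs, queue_name: str, attempts: int = 6, initial_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> Tuple[Optional[str], int]:
    """Poll ListQueues with exponential backoff; returns (url or None, polls made)."""
    delay = initial_delay
    for attempt in range(1, attempts + 1):
        url = find_queue_url(sqs, queue_name)
        if url is not None:
            return url, attempt
        if attempt < attempts:
            sleep(delay)
            delay = min(delay * 2, 8.0)
    return None, attempts


def create_queue_and_confirm(sqs, queue_name: str,
                             sleep: Callable[[float], None] = time.sleep) -> Dict[str, object]:
    """CreateQueue's own response is the source of truth; listing only confirms propagation."""
    url = sqs.create_queue(QueueName=queue_name)["QueueUrl"]
    listed, polls = wait_until_listed(sqs, queue_name, sleep=sleep)
    return {"url": url, "visible_in_list": listed == url, "polls": polls}


class FakeSqs:
    """Constructed stand-in: the new queue shows up in ListQueues only after N list calls."""
    ACCOUNT_URL = "https://sqs.eu-west-1.amazonaws.com/123456789012/"  # assumed

    def __init__(self, visible_after: int) -> None:
        self.visible_after = visible_after
        self.list_calls = 0
        self.pending: Optional[str] = None
        self.queues = {"just_created_v2"}  # pre-existing queue that shares the prefix

    def create_queue(self, QueueName: str) -> Dict[str, str]:
        self.pending = QueueName
        return {"QueueUrl": self.ACCOUNT_URL + QueueName}

    def list_queues(self, QueueNamePrefix: str = "") -> Dict[str, list]:
        self.list_calls += 1
        names = set(self.queues)
        if self.pending and self.list_calls >= self.visible_after:
            names.add(self.pending)
        return {"QueueUrls": [self.ACCOUNT_URL + n for n in sorted(names)
                              if n.startswith(QueueNamePrefix)]}


def demo(visible_after: int = 3) -> Dict[str, object]:
    """Create 'just_created' and poll until it is listed; sleeping is skipped for the demo."""
    return create_queue_and_confirm(FakeSqs(visible_after), "just_created", sleep=lambda s: None)
```

With a real client drop the `sleep` argument so the backoff actually waits; if the queue never becomes visible within the attempts, treat that as an alert condition rather than looping forever.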