I am using persistent actors with JDBC for the journal. All messages to my persistent actor end up as dead letters, but I cannot see why, since I send them directly to the persistent actor.
Persistent actor code:
case class ExampleState(events: List[String] = Nil) {
def updated(evt: Evt): ExampleState = copy(evt.data :: events)
def size: Int = events.length
override def toString: String = events.reverse.toString
}
class ExampleActor extends PersistentActor {
override def persistenceId = "sample-id-1"
var state = ExampleState()
def updateState(event: Evt): Unit = {
state = state.updated(event)
}
def numEvents =
state.size
override def receiveRecover: Receive = {
case evt: Evt => updateState(evt)
case SnapshotOffer(_, snapshot: ExampleState) => state = snapshot
}
val snapShotInterval = 1000
override def receiveCommand: Receive = {
case Cmd(data) => {
println("in the command code block")
persist(Evt(s"${data}-${numEvents}")) { event => {
updateState(event)
context.system.eventStream.publish(event)
if (lastSequenceNr % snapShotInterval == 0 && lastSequenceNr != 0)
saveSnapshot(state)
}
}
}
case Shutdown => context.stop(self)
case "print"=>println(state)
}
}
Test code (all messages sent to the persistent actor become dead letters):
"The example persistent actor" should {
"Test Command" in {
val persistentActor = system.actorOf(Props[ExampleActor](),"examplePersistentactor")
Thread.sleep(2000)
println("before the send")
persistentActor ! Cmd("foo")
persistentActor ! Cmd("bar")
persistentActor ! Cmd("fizz")
persistentActor ! Cmd("buzz")
persistentActor ! "print"
Thread.sleep(10000)
persistentActor ! Shutdown
println("after messages should be sent and received")
}
}
Dead letters happen when there is no actor instance actually running to receive the message. The message delivery process is the same whether or not the target actor is a persistent actor.
So I suppose your persistent actor is not actually running when you send a message to it. That could be because the persistence settings are not configured properly.
I ran your code (changing Cmd and Evt to String) using in-memory persistence and it worked.
Thanks for your reply! Are you by any chance familiar with the JDBC plugin? I configured the journal/snapshot store and the Slick connection to the database, and it seems to match the plugin's documentation. Is there anything you can see that is wrong or missing?
application.conf:
akka {
loglevel = DEBUG
}
# general.conf is included only for shared settings used for the akka-persistence-jdbc tests
include "general.conf"
akka {
persistence {
journal {
plugin = "jdbc-journal"
# Enable the line below to automatically start the journal when the actorsystem is started
auto-start-journals = ["jdbc-journal"]
}
snapshot-store {
plugin = "jdbc-snapshot-store"
# Enable the line below to automatically start the snapshot-store when the actorsystem is started
auto-start-snapshot-stores = ["jdbc-snapshot-store"]
}
}
}
jdbc-journal {
slick = ${slick}
}
# the akka-persistence-snapshot-store in use
jdbc-snapshot-store {
slick = ${slick}
}
# the akka-persistence-query provider in use
jdbc-read-journal {
slick = ${slick}
}
slick {
profile = "slick.jdbc.MySQLProfile$"
db {
url = "jdbc:mysql://localhost:3306/squirrel_persistence/mysql?cachePrepStmts=true&cacheCallableStmts=true&cacheServerConfiguration=true&useLocalSessionState=true&elideSetAutoCommits=true&alwaysSendSetIsolation=false&enableQueryTimeouts=false&connectionAttributes=none&verifyServerCertificate=false&useSSL=false&allowPublicKeyRetrieval=true&useUnicode=true&useLegacyDatetimeCode=false&serverTimezone=UTC&rewriteBatchedStatements=true"
user = "root"
password = ""
driver = "com.mysql.jdbc.Driver"
numThreads = 5
maxConnections = 5
minConnections = 1
}
}
In my project, some EC2 instances will be shut down. These instances are only connected to when a user needs to work.
Users will access the instances using a clientless remote desktop gateway called Apache Guacamole.
If an instance is stopped, how can I start it through Apache Guacamole?
(Screenshot: Guacamole home screen)
Guacamole is, essentially, an RDP/VNC/SSH client, and I don't think you can get the instances to start up by themselves, since there is no wake-on-LAN feature or anything like it out of the box.
I used to have a similar issue, and we always kept one instance up and running and used it to run the AWS CLI to start up the instances we wanted.
Alternatively, you could modify the calls from Guacamole to invoke a Lambda function that checks whether the instance you wish to connect to is running and starts it if not; but then you'd have to deal with the timeout for starting a session from Guacamole (not sure if this is configurable from the web admin console or the config files), or set up another way of getting feedback for when your instance becomes available.
There was a discussion on the Guacamole mailing list regarding a wake-on-LAN feature, and one approach was proposed. It is based on a script that monitors connection attempts and launches instances when needed.
Although it is more of a workaround, maybe it will be helpful for you. For a proper solution, it is possible to develop an extension.
You may find the discussion and a link to the script here:
http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/guacamole-and-wake-on-LAN-td7526.html
http://apache-guacamole-general-user-mailing-list.2363388.n4.nabble.com/Wake-on-lan-function-working-td2832.html
There is unfortunately not a very simple solution. The Lambda approach is the way we solved it.
Guacamole has a feature that logs accesses to CloudWatch Logs.
Next, we need the connection_id and the username/id attached as tags on the instance. We assign these tags automatically with our back-end tool when starting the instances.
Now, when a user connects to a machine, a log entry is written to CloudWatch Logs.
A subscription filter is applied so that only login attempts trigger the Lambda.
The triggered Lambda script checks whether there is an instance with tags corresponding to the current connection attempt and whether the instance is stopped, plus other constraints, for example whether the instance has expired.
If so, the instance is started, and in roughly 40 seconds the user is able to connect.
The Lambda script looks like this:
#receive information from cloudwatch event, parse it call function to start instances
import re
import boto3
import datetime
from conn_inc import *
from start_instance import *
def lambda_handler(event, context):
# Variables
region = "eu-central-1"
cw_region = "eu-central-1"
# Clients
ec2Client = boto3.client('ec2')
# Session
session = boto3.Session(region_name=region)
# Resource
ec2 = session.resource('ec2', region)
print(event)
#print ("awsdata: ", event['awslogs']['data'])
userdata ={}
userdata = get_userdata(event['awslogs']['data'])
print ("logDataUserName: ", userdata["logDataUserName"], "connection_ids: ", userdata["startConnectionId"])
start_instance(ec2,ec2Client, userdata["logDataUserName"],userdata["startConnectionId"])
import boto3
import datetime
from datetime import date
import gzip
import json
import base64
from start_related_instances import *
def start_instance(ec2,ec2Client,logDataUserName,startConnectionId):
# Boto 3
# Use the filter() method of the instances collection to retrieve
# all stopped EC2 instances which have the tag connection_ids.
instances = ec2.instances.filter(
Filters=[
{
'Name': 'instance-state-name',
'Values': ['stopped'],
},
{
'Name': 'tag:connection_ids',
'Values': [f"*{startConnectionId}*"],
}
]
)
# print ("instances: ", list(instances))
#check if instances are found
if len(list(instances)) == 0:
print("No instances with connectionId ", startConnectionId, " found that is stopped.")
else:
for instance in instances:
print(instance.id, instance.instance_type)
expire = ""
connectionName = ""
for tag in instance.tags:
if tag["Key"] == 'expire': #get expiration date
expire = tag["Value"]
if (expire == ""):
print ("Start instance: ", instance.id, ", no expire found")
ec2Client.start_instances(
InstanceIds=[instance.id]
)
else:
print("Check if instance already expired.")
splitDate = expire.split(".")
expire = datetime.datetime(int(splitDate[2]) , int(splitDate[1]) , int(splitDate[0]) )
args = date.today().timetuple()[:6]
today = datetime.datetime(*args)
if (expire >= today):
print("Instance is not yet expired.")
print ("Start instance: ", instance.id, "expire: ", expire, ", today: ", today)
ec2Client.start_instances(
InstanceIds=[instance.id]
)
else:
print ("Instance not started, because it already expired: ", instance.id,"expiration: ", f"{expire}", "today:", f"{today}")
def get_userdata(cw_data):
compressed_payload = base64.b64decode(cw_data)
uncompressed_payload = gzip.decompress(compressed_payload)
payload = json.loads(uncompressed_payload)
message = ""
log_events = payload['logEvents']
for log_event in log_events:
message = log_event['message']
# print(f'LogEvent: {log_event}')
#regex = r"\'.*?\'"
#m = re.search(str(regex), str(message), re.DOTALL)
logDataUserName = message.split('"')[1] #get the username from the user logged into guacamole "Adm_EKoester_1134faD"
startConnectionId = message.split('"')[3] #get the connection Id of the connection which should be started
# create dict
dict={}
dict["connected"] = False
dict["disconnected"] = False
dict["error"] = True
dict["guacamole"] = payload["logStream"]
dict["logDataUserName"] = logDataUserName
dict["startConnectionId"] = startConnectionId
# check for connected or disconnected
ind_connected = message.find("connected to connection")
ind_disconnected = message.find("disconnected from connection")
# print ("ind_connected: ", ind_connected)
# print ("ind_disconnected: ", ind_disconnected)
if ind_connected > 0 and not ind_disconnected > 0:
dict["connected"] = True
dict["error"] = False
elif ind_disconnected > 0 and not ind_connected > 0:
dict["disconnected"] = True
dict["error"] = False
return dict
The CloudWatch Logs trigger for the Lambda is configured like this (screenshot of the subscription filter omitted):
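For reference, this is roughly how such a subscription filter can be created with boto3. This is a sketch only: the log group name, Lambda ARN, account ID, and filter pattern below are illustrative assumptions, not values from our setup.
import boto3
# Placeholders - substitute your own log group, Lambda ARN, region and account.
LOG_GROUP = "/guacamole/access-log"
LAMBDA_ARN = "arn:aws:lambda:eu-central-1:123456789012:function:start-instance"
LOG_GROUP_ARN = "arn:aws:logs:eu-central-1:123456789012:log-group:/guacamole/access-log:*"
logs = boto3.client("logs")
lambda_client = boto3.client("lambda")
# Allow CloudWatch Logs to invoke the Lambda function.
lambda_client.add_permission(
    FunctionName=LAMBDA_ARN,
    StatementId="guacamole-log-trigger",
    Action="lambda:InvokeFunction",
    Principal="logs.amazonaws.com",
    SourceArn=LOG_GROUP_ARN,
)
# Forward only login attempts ("connected to connection") to the Lambda.
logs.put_subscription_filter(
    logGroupName=LOG_GROUP,
    filterName="guacamole-login-attempts",
    filterPattern='"connected to connection"',
    destinationArn=LAMBDA_ARN,
)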
We are developing a GCP Cloud Tasks based queue process that sends a status email whenever a particular Firestore doc write-trigger fires. The reason we use Cloud Tasks is so that a delay can be created (using the scheduledTime property, 2 minutes in the future) before the email is sent, and to control dedup (by using a task name formatted as [firestore-collection-name]-[doc-id]), since the write trigger on the Firestore doc can fire several times as the document is created and then quickly updated by backend cloud functions.
Once the task's delay period has been reached, the Cloud Task runs and the email is sent with the updated Firestore document info included. After that, the task is deleted from the queue and all is good.
Except:
If the user updates the Firestore doc (say 20 or 30 minutes later), we want to resend the status email but are unable to create a task using the same task name. We get the following error:
409 The task cannot be created because a task with this name existed too recently. For more information about task de-duplication see https://cloud.google.com/tasks/docs/reference/rest/v2/projects.locations.queues.tasks/create#body.request_body.FIELDS.task.
This was unexpected, as the queue is empty at this point and the last task completed successfully. The documentation referenced in the error message says:
If the task's queue was created using Cloud Tasks, then another task
with the same name can't be created for ~1hour after the original task
was deleted or executed.
Question: is there some way in which this restriction can be by-passed by lowering the amount of time, or even removing the restriction all together?
The short answer is no. As you've already pointed out, the docs are very clear regarding this behavior, and you have to wait about 1 hour to create a task with the same name as one that was previously created. Neither the API nor the client libraries allow you to decrease this time.
Having said that, I would suggest that instead of reusing the same task ID, you use a different one for each task and add an identifier in the body of the request. For example, using Python:
from google.cloud import tasks_v2
from google.protobuf import timestamp_pb2
import datetime
def create_task(project, queue, location, payload=None, in_seconds=None):
client = tasks_v2.CloudTasksClient()
parent = client.queue_path(project, location, queue)
task = {
'app_engine_http_request': {
'http_method': 'POST',
'relative_uri': '/task/'+queue
}
}
if payload is not None:
converted_payload = payload.encode()
task['app_engine_http_request']['body'] = converted_payload
if in_seconds is not None:
d = datetime.datetime.utcnow() + datetime.timedelta(seconds=in_seconds)
timestamp = timestamp_pb2.Timestamp()
timestamp.FromDatetime(d)
task['schedule_time'] = timestamp
response = client.create_task(parent, task)
print('Created task {}'.format(response.name))
print(response)
#You can change DOCUMENT_ID with USER_ID or something to identify the task
create_task(PROJECT_ID, QUEUE, REGION, DOCUMENT_ID)
Facing a similar problem of needing to debounce multiple instances of Firestore write-trigger functions, we worked around the default Cloud Tasks task-name based dedup mechanism (still a constraint in Nov 2022) by building a small debounce "helper" using Firestore transactions.
We're using a helper collection _syncHelper_ to implement a delayed throttle for side effects of write-trigger fires - in the OP's case, send one email for all writes within 2 minutes.
In our case we are using the Firebase Functions task queue utils rather than interacting with Cloud Tasks directly, but that's immaterial to the solution. The key is to determine the task's execution time in advance and use that as the "dedup key":
// Assumed imports for this snippet (firebase-admin v10+ modular API):
const { getFirestore, Timestamp } = require('firebase-admin/firestore');
const { getFunctions } = require('firebase-admin/functions');
async function enqueueTask(shopId) {
const queueName = 'doSomething';
const now = new Date();
const next = new Date(now.getTime() + 2 * 60 * 1000);
try {
const shouldEnqueue = await getFirestore().runTransaction(async t=>{
const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
const doc = await t.get(syncRef);
let data = doc.data();
if (data?.timestamp.toDate()> now) {
return false;
}
await t.set(syncRef, { timestamp: Timestamp.fromDate(next) });
return true;
});
if (shouldEnqueue) {
let queue = getFunctions().taskQueue(queueName);
await queue.enqueue({
timestamp: next.toISOString(),
},
{ scheduleTime: next }); }
} catch {
...
}
}
This will ensure a new task is enqueued only if the "next execution" time has passed.
The execution operation (also a cloud function in our case) will remove the sync data entry if it hasn't been changed since it was executed:
// Assumed imports for this snippet:
const functions = require('firebase-functions');
const { getFirestore } = require('firebase-admin/firestore');
exports.doSomething = functions.tasks.taskQueue({
retryConfig: {
maxAttempts: 2,
minBackoffSeconds: 60,
},
rateLimits: {
maxConcurrentDispatches: 2,
}
}).onDispatch(async data => {
let { timestamp } = data;
await sendYourEmailHere();
await getFirestore().runTransaction(async t => {
const syncRef = getFirestore().collection('_syncHelper_').doc(<collection_id-doc_id>);
const doc = await t.get(syncRef);
let data = doc.data();
if (data?.timestamp.toDate() <= new Date(timestamp)) {
await t.delete(syncRef);
}
});
});
This isn't a bulletproof solution (if the doSomething() execution function has high latency, for example), but it is good enough for 99% of our use cases.
We are using the Alpakka S3 connector to connect to an S3 bucket from a local system within a VPC, and we are getting the error below. If we use our traditional AWS client library we are able to connect to S3 and download the file. I am also attaching the sample code we are using for the Alpakka S3 connector.
Is this error because I have to set up a VPC proxy in code, as I do with our traditional AWS S3 library? I don't see an option in Alpakka to set up my VPC proxy.
Error -
akka.stream.StreamTcpException: Tcp command [Connect(bucket-name.s3.amazonaws.com:443,None,List(),Some(10 seconds),true)] failed because of Connect timeout of Some(10 seconds) expired
Code -
override def main(args: Array[String]): Unit = {
implicit val system = ActorSystem()
implicit val materializer = ActorMaterializer()
implicit val executionContext: ExecutionContext =
ExecutionContext.Implicits.global
val awsCredentialsProvider = new AWSStaticCredentialsProvider(
new BasicSessionCredentials("xxxxxx", "xxxx", "xxxx")
)
val regionProvider =
new AwsRegionProvider {
def getRegion: String = "us-east-1"
}
val settings =
new S3Settings(MemoryBufferType, None, awsCredentialsProvider,
regionProvider, false, None, ListBucketVersion2)
val s3Client = new S3Client(settings)(system, materializer)
val future = s3Client.download("bucket_name", "Data/abc.txt", None,
Some(ServerSideEncryption.AES256))
future._2.onComplete {
case Success(value) => println(s"Got the callback, meaning =
value")
case Failure(e) => e.printStackTrace
}
}
You can provide proxy settings in application.conf. Load this config, and provide it to your ActorSystem(name, config) as the config.
akka.stream.alpakka.s3 {
proxy {
host = ""
port = 8000
# use https?
secure = true
}
}
Any of the following configuration settings can be overridden in your application.conf: https://github.com/akka/alpakka/blob/master/s3/src/main/resources/reference.conf#L8
What is the best practice to move messages from a dead letter queue back to the original queue in Amazon SQS?
Would it be
Get message from DLQ
Write message to queue
Delete message from DLQ
Or is there a simpler way?
Also, will AWS eventually have a tool in the console to move messages off the DLQ?
Here is a quick hack. This is definitely not the best or recommended option.
Set the main SQS queue as the DLQ for the actual DLQ with Maximum Receives as 1.
View the content in DLQ (This will move the messages to the main queue as this is the DLQ for the actual DLQ)
Remove the setting so that the main queue is no longer the DLQ of the actual DLQ
On Dec 1, 2021, AWS released the ability to redrive messages from a DLQ back to the source queue (or a custom queue).
With dead-letter queue redrive to source queue, you can simplify and enhance your error-handling workflows for standard queues.
Source:
Introducing Amazon Simple Queue Service dead-letter queue redrive to source queues
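If you prefer to trigger this redrive programmatically rather than from the console, recent versions of boto3 expose the corresponding StartMessageMoveTask API. A minimal sketch, assuming the queue ARNs below are placeholders and that your boto3/botocore version is new enough to include this call:
import boto3
sqs = boto3.client("sqs")
# Placeholders - replace with your own queue ARNs.
DLQ_ARN = "arn:aws:sqs:us-east-1:123456789012:my-queue-dlq"
SOURCE_QUEUE_ARN = "arn:aws:sqs:us-east-1:123456789012:my-queue"
# Start an asynchronous move of all messages from the DLQ back to the source queue.
# Omitting DestinationArn sends the messages back to the queue the DLQ is attached to.
response = sqs.start_message_move_task(
    SourceArn=DLQ_ARN,
    DestinationArn=SOURCE_QUEUE_ARN,
)
print(response["TaskHandle"])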
There are a few scripts out there that do this for you:
npm / nodejs based: http://github.com/garryyao/replay-aws-dlq
# install
npm install replay-aws-dlq;
# use
npx replay-aws-dlq [source_queue_url] [dest_queue_url]
go based: https://github.com/mercury2269/sqsmover
# compile: https://github.com/mercury2269/sqsmover#compiling-from-source
# use
sqsmover -s [source_queue_url] -d [dest_queue_url]
You don't need to move the messages, because moving them brings many other challenges: duplicate messages, recovery scenarios, lost messages, de-duplication checks, and so on.
Here is the solution we implemented.
Usually, we use the DLQ for transient errors, not for permanent errors, so we took the approach below (a minimal boto3 sketch of the whole flow follows this list):
1. Read the message from the DLQ like a regular queue.
Benefits:
- Avoids duplicate message processing.
- Better control over the DLQ, e.g. a check to process the DLQ only when the regular queue has been completely processed.
- The process can be scaled up based on the number of messages on the DLQ.
2. Then follow the same code that the regular queue consumer follows. This is more reliable in case the job is aborted or the process is terminated while processing (e.g. the instance is killed or the process is terminated).
Benefits:
- Code reusability
- Error handling
- Recovery and message replay
3. Extend the message visibility so that no other thread processes the message.
Benefit:
- Avoids processing of the same record by multiple threads.
4. Delete the message only when there is either a permanent error or success.
Benefit:
- Keeps processing as long as we are getting a transient error.
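A minimal sketch of that flow with boto3. The queue URL, the visibility value, the process_message helper, and the PermanentError type are illustrative assumptions, not part of the original answer:
import boto3
sqs = boto3.client("sqs")
# Placeholder queue URL - substitute your own DLQ.
DLQ_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue-dlq"
class PermanentError(Exception):
    """Hypothetical marker for non-retryable failures."""
def process_message(body):
    # Hypothetical helper: reuse the same processing code the regular queue consumer uses.
    print("processing", body)
while True:
    resp = sqs.receive_message(QueueUrl=DLQ_URL, MaxNumberOfMessages=10, WaitTimeSeconds=10)
    messages = resp.get("Messages", [])
    if not messages:
        break
    for msg in messages:
        # Step 3: extend visibility so no other thread picks the message up in the meantime.
        sqs.change_message_visibility(
            QueueUrl=DLQ_URL,
            ReceiptHandle=msg["ReceiptHandle"],
            VisibilityTimeout=300,
        )
        try:
            # Steps 1 and 2: read from the DLQ and reuse the regular processing code path.
            process_message(msg["Body"])
        except PermanentError:
            pass  # permanent error: fall through and delete below so it is not retried
        except Exception:
            continue  # transient error: keep the message so it becomes visible again
        # Step 4: delete only on success or on a permanent error.
        sqs.delete_message(QueueUrl=DLQ_URL, ReceiptHandle=msg["ReceiptHandle"])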
I wrote a small Python script to do this, using the boto3 lib:
conf = {
"sqs-access-key": "",
"sqs-secret-key": "",
"reader-sqs-queue": "",
"writer-sqs-queue": "",
"message-group-id": ""
}
import boto3
client = boto3.client(
'sqs',
aws_access_key_id = conf.get('sqs-access-key'),
aws_secret_access_key = conf.get('sqs-secret-key')
)
while True:
messages = client.receive_message(QueueUrl=conf['reader-sqs-queue'], MaxNumberOfMessages=10, WaitTimeSeconds=10)
if 'Messages' in messages:
for m in messages['Messages']:
print(m['Body'])
ret = client.send_message( QueueUrl=conf['writer-sqs-queue'], MessageBody=m['Body'], MessageGroupId=conf['message-group-id'])
print(ret)
client.delete_message(QueueUrl=conf['reader-sqs-queue'], ReceiptHandle=m['ReceiptHandle'])
else:
print('Queue is currently empty or messages are invisible')
break
You can get this script at this link.
This script can move messages between any two arbitrary queues, and it supports FIFO queues as well, since you can supply the message_group_id field.
That looks like your best option. There is a possibility that your process fails after step 2. In that case you'll end up copying the message twice, but your application should be handling re-delivery of messages (or not care) anyway.
Here:
import boto3
import sys
import queue  # Python 3 standard library module (named Queue in Python 2)
import threading
work_queue = queue.Queue()
sqs = boto3.resource('sqs')
from_q_name = sys.argv[1]
to_q_name = sys.argv[2]
print("From: " + from_q_name + " To: " + to_q_name)
from_q = sqs.get_queue_by_name(QueueName=from_q_name)
to_q = sqs.get_queue_by_name(QueueName=to_q_name)
def process_queue():
while True:
messages = work_queue.get()
bodies = list()
for i in range(0, len(messages)):
bodies.append({'Id': str(i+1), 'MessageBody': messages[i].body})
to_q.send_messages(Entries=bodies)
for message in messages:
print("Coppied " + str(message.body))
message.delete()
for i in range(10):
t = threading.Thread(target=process_queue)
t.daemon = True
t.start()
while True:
messages = list()
for message in from_q.receive_messages(
MaxNumberOfMessages=10,
VisibilityTimeout=123,
WaitTimeSeconds=20):
messages.append(message)
work_queue.put(messages)
work_queue.join()
DLQ comes into play only when the original consumer fails to consume the message successfully after various attempts. We do not want to delete the message, since we believe we can still do something with it (maybe attempt to process it again, log it, or collect some stats), and we do not want to keep encountering this message again and again, blocking the ability to process the other messages behind it.
DLQ is nothing but just another queue. That means we would need to write a consumer for the DLQ that would ideally run less frequently (compared to the original queue), consume from the DLQ, produce the message back into the original queue, and delete it from the DLQ - if that's the intended behavior and we think the original consumer is now ready to process it again. It should be OK if this cycle continues for a while, since we also get an opportunity to manually inspect, make the necessary changes, and deploy another version of the original consumer without losing the message (within the message retention period, of course, which is 4 days by default). A minimal sketch of such a DLQ consumer is shown below.
It would be nice if AWS provided this capability out of the box, but I don't see it yet - they're leaving this to the end user to use in the way they feel appropriate.
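A minimal sketch of the DLQ consumer described above, using boto3; the queue URLs are placeholders, and running it less frequently (cron, scheduled Lambda, etc.) is left out:
import boto3
sqs = boto3.client("sqs")
# Placeholders - substitute your own queue URLs.
DLQ_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue-dlq"
SOURCE_QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/my-queue"
def redrive_batch():
    """Move up to one batch of messages from the DLQ back into the original queue."""
    resp = sqs.receive_message(QueueUrl=DLQ_URL, MaxNumberOfMessages=10, WaitTimeSeconds=5)
    for msg in resp.get("Messages", []):
        # Produce the message back into the original queue...
        sqs.send_message(QueueUrl=SOURCE_QUEUE_URL, MessageBody=msg["Body"])
        # ...and only then remove it from the DLQ.
        sqs.delete_message(QueueUrl=DLQ_URL, ReceiptHandle=msg["ReceiptHandle"])
if __name__ == "__main__":
    redrive_batch()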
There is another way to achieve this without writing a single line of code.
Suppose your actual queue name is SQS_Queue and its DLQ is SQS_DLQ.
Now follow these steps:
Set SQS_Queue as the DLQ of SQS_DLQ. Since SQS_DLQ is already the DLQ of SQS_Queue, both now act as the DLQ of the other.
Set the maximum receive count of SQS_DLQ to 1.
Now read messages from the SQS_DLQ console. Since the maximum receive count is 1, all the messages will be sent to its own DLQ, which is your actual SQS_Queue.
We use the following script to redrive messages from the src queue to the tgt queue:
filename: redrive.py
usage: python redrive.py -s {source queue name} -t {target queue name}
'''
This script is used to redrive message in (src) queue to (tgt) queue
The solution is to set the Target Queue as the Source Queue's Dead Letter Queue.
Also set Source Queue's redrive policy, Maximum Receives to 1.
Also set Source Queue's VisibilityTimeout to 5 seconds (a small period)
Then read data from the Source Queue.
Source Queue's Redrive Policy will copy the message to the Target Queue.
'''
import argparse
import json
import boto3
sqs = boto3.client('sqs')
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument('-s', '--src', required=True,
help='Name of source SQS')
parser.add_argument('-t', '--tgt', required=True,
help='Name of targeted SQS')
args = parser.parse_args()
return args
def verify_queue(queue_name):
queue_url = sqs.get_queue_url(QueueName=queue_name)
return True if queue_url.get('QueueUrl') else False
def get_queue_attribute(queue_url):
queue_attributes = sqs.get_queue_attributes(
QueueUrl=queue_url,
AttributeNames=['All'])['Attributes']
print(queue_attributes)
return queue_attributes
def main():
args = parse_args()
for q in [args.src, args.tgt]:
if not verify_queue(q):
print(f"Cannot find {q} in AWS SQS")
src_queue_url = sqs.get_queue_url(QueueName=args.src)['QueueUrl']
target_queue_url = sqs.get_queue_url(QueueName=args.tgt)['QueueUrl']
target_queue_attributes = get_queue_attribute(target_queue_url)
# Set the Source Queue's Redrive policy
redrive_policy = {
'deadLetterTargetArn': target_queue_attributes['QueueArn'],
'maxReceiveCount': '1'
}
sqs.set_queue_attributes(
QueueUrl=src_queue_url,
Attributes={
'VisibilityTimeout': '5',
'RedrivePolicy': json.dumps(redrive_policy)
}
)
get_queue_attribute(src_queue_url)
# read all messages
num_received = 0
while True:
try:
resp = sqs.receive_message(
QueueUrl=src_queue_url,
MaxNumberOfMessages=10,
AttributeNames=['All'],
WaitTimeSeconds=5)
num_message = len(resp.get('Messages', []))
if not num_message:
break
num_received += num_message
except Exception:
break
print(f"Redrive {num_received} messages")
# Reset the Source Queue's Redrive policy
sqs.set_queue_attributes(
QueueUrl=src_queue_url,
Attributes={
'VisibilityTimeout': '30',
'RedrivePolicy': ''
}
)
get_queue_attribute(src_queue_url)
if __name__ == "__main__":
main()
The AWS Lambda solution worked well for us.
Detailed instructions:
https://serverlessrepo.aws.amazon.com/applications/arn:aws:serverlessrepo:us-east-1:303769779339:applications~aws-sqs-dlq-redriver
Github: https://github.com/honglu/aws-sqs-dlq-redriver.
Deployed with a click and another click to start the redrive!
Here is also a script (written in TypeScript) to move messages from one AWS queue to another. Maybe it will be useful for someone.
import {
SQSClient,
ReceiveMessageCommand,
DeleteMessageBatchCommand,
SendMessageBatchCommand,
} from '@aws-sdk/client-sqs'
const AWS_REGION = 'eu-west-1'
const AWS_ACCOUNT = '12345678901'
const DLQ = `https://sqs.${AWS_REGION}.amazonaws.com/${AWS_ACCOUNT}/dead-letter-queue`
const QUEUE = `https://sqs.${AWS_REGION}.amazonaws.com/${AWS_ACCOUNT}/queue`
const loadMessagesFromDLQ = async () => {
const client = new SQSClient({region: AWS_REGION})
const command = new ReceiveMessageCommand({
QueueUrl: DLQ,
MaxNumberOfMessages: 10,
VisibilityTimeout: 60,
})
const response = await client.send(command)
console.log('---------LOAD MESSAGES----------')
console.log(`Loaded: ${response.Messages?.length}`)
console.log(JSON.stringify(response, null, 4))
return response
}
const sendMessagesToQueue = async (entries: Array<{Id: string, MessageBody: string}>) => {
const client = new SQSClient({region: AWS_REGION})
const command = new SendMessageBatchCommand({
QueueUrl: QUEUE,
Entries: entries.map(entry => ({...entry, DelaySeconds: 10})),
// [
// {
// Id: '',
// MessageBody: '',
// DelaySeconds: 10
// }
// ]
})
const response = await client.send(command)
console.log('---------SEND MESSAGES----------')
console.log(`Send: Successful - ${response.Successful?.length}, Failed: ${response.Failed?.length}`)
console.log(JSON.stringify(response, null, 4))
}
const deleteMessagesFromQueue = async (entries: Array<{Id: string, ReceiptHandle: string}>) => {
const client = new SQSClient({region: AWS_REGION})
const command = new DeleteMessageBatchCommand({
QueueUrl: DLQ,
Entries: entries,
// [
// {
// "Id": "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx",
// "ReceiptHandle": "someReceiptHandle"
// }
// ]
})
const response = await client.send(command)
console.log('---------DELETE MESSAGES----------')
console.log(`Delete: Successful - ${response.Successful?.length}, Failed: ${response.Failed?.length}`)
console.log(JSON.stringify(response, null, 4))
}
const run = async () => {
const dlqMessageList = await loadMessagesFromDLQ()
if (!dlqMessageList || !dlqMessageList.Messages) {
console.log('There is no messages in DLQ')
return
}
const sendMsgList: any = dlqMessageList.Messages.map(msg => ({ Id: msg.MessageId, MessageBody: msg.Body}))
const deleteMsgList: any = dlqMessageList.Messages.map(msg => ({ Id: msg.MessageId, ReceiptHandle: msg.ReceiptHandle}))
await sendMessagesToQueue(sendMsgList)
await deleteMessagesFromQueue(deleteMsgList)
}
run()
P.S. The script has room for improvement, but it might be useful anyway.
Here is a simple Python script you can use from the CLI to do the same, depending only on boto3.
usage
python redrive_messages __from_queue_name__ __to_queue_name__
code
import sys
import boto3
sqs = boto3.resource('sqs')
def redrive_messages(from_queue_name:str, to_queue_name:str):
# initialize the queues
from_queue = sqs.get_queue_by_name(QueueName=from_queue_name)
to_queue = sqs.get_queue_by_name(QueueName=to_queue_name)
# begin querying for messages
should_check_for_more = True
messages_processed = []
while (should_check_for_more):
# grab the next message
messages = from_queue.receive_messages(MaxNumberOfMessages=1);
if (len(messages) == 0):
should_check_for_more = False;
break;
message = messages[0]
# requeue it
to_queue.send_message(MessageBody=message.body, DelaySeconds=0)
# let the queue know that the message was processed successfully
messages_processed.append(message)
message.delete()
print(f'requeued {len(messages_processed)} messages')
if __name__ == '__main__':
from_queue_name = sys.argv[1]
to_queue_name = sys.argv[2]
redrive_messages(from_queue_name, to_queue_name)
I am trying to use Akka pub-sub within our application. I have a Play application which is part of an Akka cluster. I want to use the Akka cluster client to make this application listen/subscribe to topics; messages will be published from other applications.
Cluster/Subscriber side code [within Play application]
class MyRealtimeActor extends Actor {
import DistributedPubSubMediator.{ Subscribe, SubscribeAck }
def receive = {
case SubscribeAck(Subscribe("metrics", _)) => {
Logger.info("SUBSCRIBED TO MESSAGES")
context become ready
}
}
def ready: Actor.Receive = {
case m => {
Logger.info("RECEIVED MESSAGE " + m)
}
}
}
and I instantiate it like this in Global:
val cluster: ActorSystem = ActorSystem("ClusterSystem")
val metricsActor = Global.cluster.actorOf(Props(new MyRealtimeActor), "metricsActor")
ClusterReceptionistExtension(cluster).registerSubscriber("metrics", metricsActor)
and the conf file has the following:
akka {
actor {
provider = "akka.cluster.ClusterActorRefProvider"
extensions = ["akka.contrib.pattern.DistributedPubSubExtension",
"akka.contrib.pattern.ClusterReceptionistExtension"]
}
remote {
log-remote-lifecycle-events = off
netty.tcp {
hostname = "127.0.0.1"
port = 2551
}
}
cluster {
seed-nodes = [
"akka.tcp://ClusterSystem#127.0.0.1:2551"
]
auto-down-unreachable-after = 10s
}
}
When I start the Play application I can see the following log:
[INFO] [11/06/2013 17:48:42.926] [ClusterSystem-akka.actor.default-dispatcher-3] [Cluster(akka://ClusterSystem)] Cluster Node [akka.tcp://ClusterSystem#127.0.0.1:2551] - Node [akka.tcp://ClusterSystem#127.0.0.1:2551] is JOINING, roles []
[INFO] [11/06/2013 17:48:42.942] [ClusterSystem-akka.actor.default-dispatcher-5] [akka://ClusterSystem/deadLetters] Message [akka.contrib.pattern.DistributedPubSubMediator$SubscribeAck] from Actor[akka://ClusterSystem/user/distributedPubSubMediator#1608017981] to Actor[akka://ClusterSystem/deadLetters] was not delivered. [1] dead letters encountered. This logging can be turned off or adjusted with configuration settings 'akka.log-dead-letters' and 'akka.log-dead-letters-during-shutdown'.
I would like to know why the actor is not properly subscribed. I am expecting it to print SUBSCRIBED TO MESSAGES.
The thing is that the SubscribeAck is sent to the sender of the Subscribe message and not the actor in the Subscribe message. To get the SubscribeAck sent to the metricsActor, it would have to send the Subscribe itself, and directly to the mediator.
The receptionist is used by the cluster client code, and you shouldn’t use that to subscribe your actors normally.