Mailgun 499 Errors Occurring After Mail Appears To Be Delivered

For some reason we are seeing 499 errors in the log that appear to occur after the email shows as delivered.
We are trying to get to the bottom of what might be going on, as we are seeing a significant number of 499 errors.
For example, what we have noticed in the log is that the timestamp shows a message was delivered at 2:39 p.m., with subsequent 499 errors logged after 2:39 p.m. for the same message. Based on the timestamps, it doesn't make sense to us that a message would be delivered and then generate errors after delivery.
Any guidance you can provide would be greatly appreciated.
Thanks.

Related

Pub/Sub - unable to pull undelivered messages

There is an issue with my company's Pub/Sub. Some of our messages are stuck and the oldest unacked message age is increasing over time.
1-day charts:
When I go to Metrics Explorer and select the Expired ack deadlines count metric, this is the one-week chart.
I decided to find out why these messages are stuck, but when I ran the pull command, I got a Listed 0 items response. It is therefore not possible to see them.
Is there a way to figure out why some of the messages are displayed as unacknowledged?
Also, the Unacked message count shows the same number of messages (around 2k) for the whole month, even though new messages are published every day.
Here are the parameters we use for this subscription:
I tried to fix this error by setting the deadline to 600 seconds, but it didn't help.
Additionally, I want to mention that we use the Node.js Pub/Sub client library to handle the messages.
The most common causes of messages not being able to be pulled are:
1. The subscriber client already received the messages and "forgot" about them, perhaps due to an exception being thrown and not handled. In this case, the message will continue to be leased by the client until the deadline passes. The client libraries all extend the lease automatically until the maxExtension time is reached. If these messages are always forgotten, they could be redelivered to the subscriber and forgotten again, resulting in them not being pullable via the gcloud command-line tool or UI.
2. There could be a rogue subscriber. Another subscriber may be running somewhere for the same subscription and "stealing" these messages. Sometimes this is a test job, or something that was used early on to see if the subscription works as expected and was never turned down.
3. You could be falling into the case of a large backlog of small messages. This should be fixed in more recent versions of the client library (v2.3.0 of the Node client has the fix).
4. The gcloud pubsub subscriptions pull command and UI are not guaranteed to return messages, even if there are some available to pull. Sometimes, rerunning the command multiple times in quick succession helps to pull messages.
The fact that you see expired ack deadlines likely points to 1, 2, or 3, so it is worth checking for those things. Otherwise, you should open a support case so the engineers can look more specifically at the backlog and determine where the messages are.
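Cause 1 above usually comes down to a handler that throws before acking. A minimal sketch of the defensive pattern, assuming a hypothetical processMessage function (makeSafeHandler is an illustrative name, not part of the client library):

```javascript
// Wrap the message handler so every message is explicitly ack'd or nack'd,
// even when processing throws. An unhandled exception would otherwise leave
// the message leased ("forgotten") until the ack deadline or maxExtension.
function makeSafeHandler(processMessage) {
  return async (message) => {
    try {
      await processMessage(message.data);
      message.ack(); // success: remove the message from the backlog
    } catch (err) {
      console.error('processing failed, nacking for redelivery:', err);
      message.nack(); // failure: hand the message back immediately
    }
  };
}

// With the real @google-cloud/pubsub client this would be wired up roughly as:
//   const { PubSub } = require('@google-cloud/pubsub');
//   const subscription = new PubSub().subscription('my-subscription');
//   subscription.on('message', makeSafeHandler(processMessage));
```

The nack on failure makes stuck messages visible quickly as redeliveries (and expired ack deadlines in the metrics), rather than silently held leases.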

StoppedReason in ECS Fargate is truncated

In ECS Fargate, when a task fails, there is a "Stopped Reason" field which gives some useful logging. However, I have noticed that it gets truncated after 255 characters (screenshot below).
I checked the network tab and tracked the JSON of the HTTP response; it is truncated even there (so the truncation is server-side). Is there any way to get the complete message?
I found this thread where they discuss the same problem.
How can I see the whole, untruncated error message?
I found the whole error message in CloudTrail eventually. I searched by "Username", and entered the Task GUID as username. This narrowed down the amount of events I had to sift through. The full error message was in a "GetParameters" event.
Just FYI for anyone who reads this answer: the task GUID is the ID at the end of the taskArn, or, if you go to the Task in the console, it is the ID that you see in Task: fc0398a94d8b4d4d9218a6d870914a80 –
Emmanuel N K
Jun 21 at 13:21

Facebook API calls rate limit reached

3 days ago we received an alert from the Facebook developers page informing us that one of our apps had reached 100% of the hourly rate limit. Our application had an error that caused the increase in calls to the APIs, which we fixed yesterday afternoon. Since we deployed the fix, the API calls graph (graph: "Application Level Rate Limiting") shows that we no longer reach the limit, but the calls to the Facebook APIs are still failing. We want to know if there is a recovery period before access to the APIs is restored once we are back under the limit.
Here you can see a screenshot of the alert:
In the response headers of one of the calls, we receive this error:
Status code: 403
Header name: WWW-Authenticate
Header value: OAuth "Facebook Platform" "invalid_request" "(#4) Application request limit reached
You can see the header here
You are not the only one right now:
https://developers.facebook.com/support/bugs/169774397034403/
But I suppose it should be gone after a few hours or a day. In my experience, sometimes I can make a few calls and then it shuts me off again, even though our application is not that API-call intensive.
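While waiting for the limit to reset, a common client-side mitigation for error code #4 is to back off exponentially instead of retrying immediately. A hedged sketch; callGraphApi and the delay values are assumptions for illustration, not part of Facebook's documentation:

```javascript
// Retry a Graph API call with exponential backoff whenever it comes back
// 403 (the rate-limited case described above), instead of hammering the API.
async function callWithBackoff(callGraphApi, maxRetries = 5, delayMs = 60_000) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    const res = await callGraphApi();
    if (res.status !== 403) return res; // not rate limited: hand it back
    // Rate limited: wait, then double the delay, capped at 30 minutes.
    await new Promise((resolve) => setTimeout(resolve, delayMs));
    delayMs = Math.min(delayMs * 2, 30 * 60_000);
  }
  throw new Error('Graph API still rate limited after retries');
}
```

Starting the delay at a minute (the limit is hourly) keeps the app from re-tripping the limit the moment calls start succeeding again.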
This is the response from Facebook:
Dear all,
We checked with our rate limiting team, who confirmed that several improvements were made to help you troubleshoot rate limit related error messages. For example, we've fixed an existing graph and added a new one in the app dashboard to give you more info about the status of your app.
If you continue to receive error code #4 in your request, we'd appreciate it if you can create a new bug report because this thread is getting rather long. We'll be happy to analyze each individual case for you if you can provide the following info:
- your app ID
- the entire error message, including the trace ID
- a screenshot of the graphs on your app dashboard
Thanks for your patience while we looked into this.
Xiao

Resending Notification on error in Error Reporting

This is regarding the re-sending of notifications for errors of the same kind.
In my current project, my errors are being grouped.
For example: if an SQL error occurs for the first time, I receive a notification, but when it occurs again 2 or 3 hours later it is grouped under the same log entry and no notification is sent.
On what basis does Error Reporting group the errors?
I tried to randomise the error messages in order to distinguish them, but they are still grouped under the same category. (For example, messages like service unavailable - 12, service unavailable - 23, etc.)
I want to receive a notification for each and every error, irrespective of its type or repetition.
Can you suggest a solution?
What you're describing is alerting based on a log-based metric: https://cloud.google.com/logging/docs/logs-based-metrics/charts-and-alerts#creating_a_simple_alerting_policy_on_a_counter_metric

CFMail Issues since Upgrading to CF10

Ever since upgrading to CF10, we've been having some odd issues with our automated ColdFusion emails. The processes always functioned properly in the past, but lately we've been getting some very out of the ordinary issues which I'll describe further below.
We usually discover the problem from contacts who receive these emails on a daily basis, with or without attachments. We'll go to the CFMail directory for the corresponding server and find a slew of emails stuck in the 'Undelivr' folder. In some cases, we can just move these emails to the spool folder and they process fine, but in most cases they result in one of the two errors below:
Error 1: For an email which normally does not contain a body and contains an attachment, the following error is what we found in the logs:
"Error","scheduler-1","01/15/13","14:09:56",,"javax.mail.MessagingException: missing body for message"
javax.mail.MessagingException: missing body for message
at coldfusion.mail.MailImpl.createMessage(MailImpl.java:696)
at coldfusion.mail.MailSpooler.deliver(MailSpooler.java:1295)
at coldfusion.mail.MailSpooler.sendMail(MailSpooler.java:1197)
at coldfusion.mail.MailSpooler.deliverFast(MailSpooler.java:1657)
at coldfusion.mail.MailSpooler.run(MailSpooler.java:1567)
at coldfusion.scheduling.ThreadPool.run(ThreadPool.java:211)
at coldfusion.scheduling.WorkerThread.run(WorkerThread.java:71)
Placing these emails, which have always been sent out this way in the past without an attachment, back in the spool directory causes them to go right back into the 'Undelivr' folder with the same error. We ended up having to modify the email file, add random content to the body, and place it back in the spool directory, and then it went through. Mind-boggling.
Error 2:
"Error","scheduler-2","02/04/13","09:08:17",,"javax.mail.MessagingException: Exception reading response; nested exception is: java.net.SocketException: Connection reset"
Both errors occur randomly, and we have not been able to find out what causes them. All other emails go through fine, but certain emails will never go out and end up in the 'Undelivr' folder.
We are running on Windows Server 2008 64-bit.
I was facing the second error (connection reset) a couple of weeks ago, but that was on CF9 and with SSL only. Here is a blog post, if that helps:
http://www.isummation.com/blog/getting-javaxmailmessagingexception-could-not-connect-to-smtp-host-xxxxxxx-port-465-response-1-error-in-coldfusion/