BotoServerError: 400 Bad Request while sending email from EC2 Ubuntu instance - Django

I am using the Django (Python) framework deployed on an AWS EC2 Ubuntu instance, sending email using boto and the AWS SES service.
The script used to work, but for the past few days I have been getting this error:
BotoServerError at /contact_us/
BotoServerError: 400 Bad Request
<ErrorResponse xmlns="http://ses.amazonaws.com/doc/2010-12-01/">
<Error>
<Type>Sender</Type>
<Code>RequestExpired</Code>
<Message>Request timestamp: Wed, 16 Mar 2016 16:57:21 GMT expired. It must be within 300 secs/ of server time.</Message>
</Error>
<RequestId>368a4b97-eb97-11e5-bf2d-8ff0675b134d</RequestId>
</ErrorResponse>
Exception Location: /usr/local/lib/python2.7/dist-packages/boto/ses/connection.py in _handle_error, line 177
Server time: Wed, 16 Mar 2016 16:57:21 +0000
SES works on UTC, and I have set the time on the EC2 instance to UTC as well.
How can I solve this issue?

Request timestamp: Wed, 16 Mar 2016 16:57:21 GMT expired. It must be within 300 secs/ of server time.
Since you said it stopped working a few days ago, this is most likely due to the recent daylight saving time change, and it is likely you are not running NTP to keep your clock in sync.
Try this: sudo ntpdate pool.ntp.org
which will sync your system clock. If you want to make sure the time sync happens periodically, then start the NTP daemon:
sudo service ntp stop
sudo ntpdate -s pool.ntp.org
sudo service ntp start
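On newer Ubuntu releases that use systemd, a quick way to confirm the clock is actually in sync is timedatectl; the commands below are a minimal sketch assuming systemd-timesyncd is available as an alternative to the ntp package:
# Show current time, time zone and whether NTP synchronization is active
timedatectl status
# Enable the built-in systemd-timesyncd client so the clock stays synced
sudo timedatectl set-ntp true
# One-off manual sync if the drift is already large (requires the ntpdate package)
sudo ntpdate -u pool.ntp.org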

Related

Apache TLS Handshakes Timeout after DHCP Lease Renewal

I'm trying to figure out why my HTTPS sites go down every time my server's DHCP lease gets renewed.
It happens consistently, but HTTP sites continue to work just fine.
Restarting systemd-networkd brings the sites back, but until that happens the HTTPS sites are basically unreachable.
Any tips on where to look first?
The weird thing is that the sites come back after the next DHCP lease renewal, then go down again on the one after that, then come back, and so on.
This is what I see in syslog when it happens.
Apr 13 18:06:25 www-1 systemd-networkd[13973]: ens4: DHCP lease lost
Apr 13 18:06:25 www-1 systemd-networkd[13973]: ens4: DHCPv4 address 10.138.0.29/32 via 10.138.0.1
Apr 13 18:06:25 www-1 systemd-networkd[13973]: ens4: IPv6 successfully enabled
Apr 13 18:06:25 www-1 dbus-daemon[579]: [system] Activating via systemd: service name='org.freedesktop.hostname1' unit='dbus-org.freedesktop.hostname1.service' requested by ':1.231' (uid=101 pid=13973 comm="/lib/systemd/systemd-networkd " label="unconfined")
Apr 13 18:06:25 www-1 systemd-networkd[13973]: ens4: Configured
Apr 13 18:06:25 www-1 systemd[1]: Starting Hostname Service...
Apr 13 18:06:25 www-1 dbus-daemon[579]: [system] Successfully activated service 'org.freedesktop.hostname1'
Apr 13 18:06:25 www-1 systemd[1]: Started Hostname Service.
Apr 13 18:06:25 www-1 systemd-hostnamed[17589]: Changed host name to 'www-1.us-west1-b.c.camp-fire-259800.internal'
This issue seems to be related to the following:
https://moss.sh/name-resolution-issue-systemd-resolved/
and
https://github.com/systemd/systemd/issues/9243
I've disabled systemd-resolved and am using a static /etc/resolv.conf copied from /run/systemd/resolve/resolv.conf
For internal DNS I'm using a private Google DNS Zone.
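For reference, this is roughly how I pinned the resolver (a sketch, assuming the addresses in /run/systemd/resolve/resolv.conf are the upstream resolvers you want to keep):
# Stop and disable the systemd-resolved stub resolver
sudo systemctl disable --now systemd-resolved
# Replace the symlinked /etc/resolv.conf with a static copy of the current config
sudo rm /etc/resolv.conf
sudo cp /run/systemd/resolve/resolv.conf /etc/resolv.conf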
Thanks.

Trouble with email setup for MediaWiki site

I'm using Google Compute Engine, Bitnami, and Mailgun to set up a MediaWiki site (v1.33.1-1 on Debian 9). I'm very new to all of these things.
My Mailgun is properly set up and verified, and I'm following the documentation provided here: https://cloud.google.com/compute/docs/tutorials/sending-mail/using-mailgun
When I run:
echo 'Test passed.' | mail -s 'Test-Email' EMAIL@EXAMPLE.COM
And then:
tail -n 5 /var/log/syslog
These are my results:
root@bitnami-mediawiki-860c:~# tail -n 5 /var/log/syslog
Nov 15 03:58:39 bitnami-mediawiki-860c postfix/qmgr[13119]: 8E84FA13DA: from=<>, size=2918, nrcpt=1 (queue active)
Nov 15 03:58:39 bitnami-mediawiki-860c postfix/bounce[13144]: 7A557A13D9: sender non-delivery notification: 8E84FA13DA
Nov 15 03:58:39 bitnami-mediawiki-860c postfix/qmgr[13119]: 7A557A13D9: removed
Nov 15 03:58:39 bitnami-mediawiki-860c postfix/smtp[13142]: 8E84FA13DA: to=<root@bitnami-mediawiki-860c>, relay=none, delay=0.01, delays=0.01/0/0/0, dsn=
5.4.4, status=bounced (Host or domain name not found. Name service error for name=bitnami-mediawiki-860c type=AAAA: Host not found)
Nov 15 03:58:39 bitnami-mediawiki-860c postfix/qmgr[13119]: 8E84FA13DA: removed
Can anyone tell me how to fix this? Be specific if you can, as I'm beginning from nearly zero prior knowledge.
Google Cloud has always blocked port 25 by default; however, you can use different ports, e.g. 587 and 465.
Those ports should work for sending mail from a VM instance, and the blocked port is likely the root cause of this not working as expected. As mentioned in the comments, it should also work with port 2525.
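As a rough sketch (following the Google/Mailgun tutorial linked in the question), Postfix can be pointed at Mailgun on port 587 instead of 25; the domain and password below are placeholders for your own Mailgun SMTP credentials:
# Relay outbound mail through Mailgun on port 587 with SASL authentication
sudo postconf -e 'relayhost = [smtp.mailgun.org]:587'
sudo postconf -e 'smtp_sasl_auth_enable = yes'
sudo postconf -e 'smtp_sasl_password_maps = hash:/etc/postfix/sasl_passwd'
sudo postconf -e 'smtp_sasl_security_options = noanonymous'
sudo postconf -e 'smtp_tls_security_level = encrypt'
# Store the Mailgun SMTP credentials (placeholder values)
echo '[smtp.mailgun.org]:587 postmaster@YOUR-DOMAIN:YOUR-PASSWORD' | sudo tee /etc/postfix/sasl_passwd
sudo chmod 600 /etc/postfix/sasl_passwd
sudo postmap /etc/postfix/sasl_passwd
sudo systemctl restart postfix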

Jenkins suddenly started failing to provision agents in Amazon EKS

We are using the Kubernetes plugin to provision agents in EKS, and around 8:45 pm EST yesterday, with no apparent changes on our end (I'm the only admin, and I certainly wasn't doing anything then), we started having issues provisioning agents. I have rebooted the EKS node and the Jenkins master. I can confirm that kubectl works fine and lists 1 node running.
I'm suspecting something must have changed on the AWS side of things.
What's odd is that those ALPN errors don't show up anywhere else in our logs until just before this started happening. Googling around, I see people saying to ignore these "info" messages because the Java version doesn't support ALPN, but the fact that it's complaining about "HTTP/2" makes me wonder if Amazon changed something on their end to be HTTP/2 only.
I know this might seem too specific for an SO question, but if something did change on the AWS side that broke compatibility, I think this would be the right place.
From the Jenkins log at around 8:45:
INFO: Docker Container Watchdog check has been completed
Aug 29, 2019 8:42:05 PM hudson.model.AsyncPeriodicWork$1 run
INFO: Finished DockerContainerWatchdog Asynchronous Periodic Work. 0 ms
Aug 29, 2019 8:45:04 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
INFO: Excess workload after pending Kubernetes agents: 1
Aug 29, 2019 8:45:04 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
INFO: Template for label eks: Kubernetes Pod Template
Aug 29, 2019 8:45:04 PM okhttp3.internal.platform.Platform log
INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Aug 29, 2019 8:45:04 PM hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
INFO: Started provisioning Kubernetes Pod Template from eks with 1 executors. Remaining excess workload: 0
Aug 29, 2019 8:45:14 PM hudson.slaves.NodeProvisioner$2 run
INFO: Kubernetes Pod Template provisioning successfully completed. We have now 3 computer(s)
Aug 29, 2019 8:45:14 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
INFO: Created Pod: jenkins-eks-39hfp in namespace jenkins
Aug 29, 2019 8:45:14 PM okhttp3.internal.platform.Platform log
INFO: ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Aug 29, 2019 8:45:14 PM io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1 onFailure
WARNING: Exec Failure: HTTP 403, Status: 403 -
java.net.ProtocolException: Expected HTTP 101 response but was '403 Forbidden'
at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Aug 29, 2019 8:45:14 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
WARNING: Error in provisioning; agent=KubernetesSlave name: jenkins-eks-39hfp, template=PodTemplate{inheritFrom='', name='jenkins-eks', namespace='jenkins', slaveConnectTimeout=300, label='eks', nodeSelector='', nodeUsageMode=NORMAL, workspaceVolume=EmptyDirWorkspaceVolume [memory=false], volumes=[HostPathVolume [mountPath=/var/run/docker.sock, hostPath=/var/run/docker.sock], EmptyDirVolume [mountPath=/tmp/build, memory=false]], containers=[ContainerTemplate{name='jnlp', image='infra-docker.artifactory.mycompany.io/jnlp-docker:latest', alwaysPullImage=true, workingDir='/home/jenkins/work', command='', args='-url http://jenkins.mycompany.io:8080 ${computer.jnlpmac} ${computer.name}', ttyEnabled=true, resourceRequestCpu='', resourceRequestMemory='', resourceLimitCpu='', resourceLimitMemory='', envVars=[KeyValueEnvVar [getValue()=/home/jenkins, getKey()=HOME]], livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe#2043f440}], envVars=[KeyValueEnvVar [getValue()=/tmp/build, getKey()=BUILDDIR]], imagePullSecrets=[org.csanchez.jenkins.plugins.kubernetes.PodImagePullSecret#40ba07e2]}
io.fabric8.kubernetes.client.KubernetesClientException:
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:198)
at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Aug 29, 2019 8:45:14 PM org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
INFO: Terminating Kubernetes instance for agent jenkins-eks-39hfp
Aug 29, 2019 8:45:14 PM hudson.init.impl.InstallUncaughtExceptionHandler$DefaultUncaughtExceptionHandler uncaughtException
SEVERE: A thread (OkHttp Dispatcher/255634) died unexpectedly due to an uncaught exception, this may leave your Jenkins in a bad way and is usually indicative of a bug in the code.
java.util.concurrent.RejectedExecutionException: Task java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask#2c315338 rejected from java.util.concurrent.ScheduledThreadPoolExecutor#2bddc643[Terminated, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0]
at java.util.concurrent.ThreadPoolExecutor$AbortPolicy.rejectedExecution(ThreadPoolExecutor.java:2063)
at java.util.concurrent.ThreadPoolExecutor.reject(ThreadPoolExecutor.java:830)
at java.util.concurrent.ScheduledThreadPoolExecutor.delayedExecute(ScheduledThreadPoolExecutor.java:326)
at java.util.concurrent.ScheduledThreadPoolExecutor.schedule(ScheduledThreadPoolExecutor.java:533)
at java.util.concurrent.ScheduledThreadPoolExecutor.submit(ScheduledThreadPoolExecutor.java:632)
at java.util.concurrent.Executors$DelegatedExecutorService.submit(Executors.java:678)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.scheduleReconnect(WatchConnectionManager.java:300)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager.access$800(WatchConnectionManager.java:48)
at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onFailure(WatchConnectionManager.java:213)
at okhttp3.internal.ws.RealWebSocket.failWebSocket(RealWebSocket.java:571)
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:198)
at okhttp3.RealCall$AsyncCall.execute(RealCall.java:206)
at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Ran into this today as AWS just pushed the update for the Go net/http CVE for Kubernetes versions 1.12.x. That patch apparently broke the version of the Kubernetes plugin we were on. Updating to the latest version of the plugin (1.18.3) resolved the issue.
https://issues.jenkins-ci.org/browse/JENKINS-59000?page=com.atlassian.jira.plugin.system.issuetabpanels%3Achangehistory-tabpanel
Just realized my Kubernetes plugin had an update available. Applied that and it seems to be working fine now.
It's a bit difficult to tell without direct troubleshooting, but the log points to Jenkins (through the Java client library) not being able to talk to the kube-apiserver because the request is being denied (HTTP 403).
I would check whether you can still talk to the cluster with a previously working KUBECONFIG using standard kubectl.
I speculate that the reason for the changed behavior could be an automatic EKS upgrade of your cluster's minor/platform version. For example, EKS recently released a patch (~08/30/19) to address CVE-2019-9512 and CVE-2019-9514.
PS. I don't think the issue is related to dropping the HTTP/2 connection.
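A minimal sketch of the checks suggested above, assuming the AWS CLI is configured and that CLUSTER_NAME and the jenkins namespace are placeholders for your own values:
# Confirm basic connectivity with the kubeconfig Jenkins uses
kubectl get nodes
# Confirm the Jenkins credentials can still create agent pods
kubectl auth can-i create pods --namespace jenkins
# Check whether EKS rolled the cluster to a new Kubernetes/platform version
aws eks describe-cluster --name CLUSTER_NAME --query 'cluster.{version: version, platformVersion: platformVersion}'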
Just update to the latest version of the Kubernetes plugin (1.18.3) and restart Jenkins. This worked for me.
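If you prefer not to click through the UI, this is a minimal sketch of the same update via the Jenkins CLI (the URL and credentials are placeholders):
# Install/upgrade the Kubernetes plugin and restart Jenkins afterwards
java -jar jenkins-cli.jar -s http://localhost:8080/ -auth admin:API_TOKEN install-plugin kubernetes -restart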

aws s3 time not synced, authentication failure using awsaccesskeyid

I had the issue of my AWS access key and secret key not authenticating.
aws s3 ls
gave
An error occurred (RequestTimeTooSkewed) when calling the ListBuckets operation: The difference between the request time and the current time is too large.
So I tried syncing the time, since my local time was incorrect. Even after the sync, the issue persisted.
I am in the ap-south-1 (Mumbai) region and my time was set correctly, but the error still occurred.
I tried launching an instance and timedatectl gave this,
Local time: Sat 2018-09-08 08:25:06 UTC
Universal time: Sat 2018-09-08 08:25:06 UTC
RTC time: Sat 2018-09-08 08:25:05
Time zone: Etc/UTC (UTC, +0000)
Network time on: yes
NTP synchronized: yes
RTC in local TZ: no
The server is also in ap-south-1, so I don't get why the local time is (UTC, +0000).
Trying to set my system clock to a similar time (UTC, +0000) results in this,
Local time: Sat 2018-09-08 20:09:46 +00
Universal time: Sat 2018-09-08 20:09:46 UTC
RTC time: Sat 2018-09-08 20:09:46
Time zone: Atlantic/Azores (+00, +0000)
System clock synchronized: yes
systemd-timesyncd.service active: no
RTC in local TZ: no
I've tried adjusting my machine's time to everything I can think of, but I am still unable to fix this error. I also added servers from my region to ntpd.conf:
server 3.in.pool.ntp.org
server 3.asia.pool.ntp.org
server 0.asia.pool.ntp.org
But this didn't help either.
The local machine is running Ubuntu 18.04 LTS; the instance is Ubuntu 16.04 LTS.
Is there something I'm missing about this? Thanks in advance.
I don't know how or why this was caused, but changing the time manually (in the BIOS, and then in the system settings by entering the hours and minutes) fixed it. Should've tried that first.
Thanks for the help.
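For anyone hitting the same RequestTimeTooSkewed error on Ubuntu 18.04, a minimal sketch of re-enabling automatic sync rather than setting the clock by hand (this assumes systemd-timesyncd, the default on 18.04):
# Turn automatic time sync back on and verify the offset is gone
sudo timedatectl set-ntp true
sudo systemctl restart systemd-timesyncd
timedatectl status
# The request signature is computed from UTC, so the time zone itself does not matter,
# only that the absolute time is within ~5 minutes of AWS's clock
aws s3 ls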

AWS EC2 instance randomly rebooting several times a day

I have a t2.nano instance that often reboots several times a day, as shown in the output of last reboot:
reboot system boot 3.13.0-74-generi Tue Sep 12 17:26 - 19:15 (01:49)
reboot system boot 3.13.0-74-generi Tue Sep 12 13:58 - 19:15 (05:17)
reboot system boot 3.13.0-74-generi Tue Sep 12 11:13 - 19:15 (08:02)
reboot system boot 3.13.0-74-generi Tue Sep 12 00:48 - 19:15 (18:27)
reboot system boot 3.13.0-74-generi Fri Sep 1 23:48 - 19:15 (10+19:27)
As you can see, the server was up and running for 10 days until it randomly rebooted. It then rebooted a total of 4 times over the next few hours.
There is nothing in /var/log/syslog at the time of the reboots. Initially the instance was running a web server, but the web server is not configured to start back up automatically after a reboot. Therefore, nothing is running on my server, yet the instance still reboots several more times.
What's going on here? Is it likely that I'm being hacked or there's a problem with Amazon's servers?
Reboots look to be taking place at 19:15
Do you have any scripts or cronjobs running that could be playing a part in it?
Try checking the AWS status page: https://status.aws.amazon.com/
Occasional reboots should be expected, but no more frequently than you'd expect with commodity hardware.
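A few things worth checking, both on the instance and via the AWS CLI (the instance ID below is a placeholder):
# Look for cron jobs or unattended-upgrade activity that lines up with the reboot times
crontab -l
sudo ls /etc/cron.d /etc/cron.daily
grep -i reboot /var/log/unattended-upgrades/unattended-upgrades.log
# Check whether AWS has scheduled maintenance or retirement events for the instance
aws ec2 describe-instance-status --instance-ids i-0123456789abcdef0 --query 'InstanceStatuses[].Events'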