We have a few microservices that deploy to EC2 instances and run fine.
But a few of the pods intermittently get a "The security token included in the request is expired" error when connecting to DynamoDB and SNS. Surprisingly, the DB connections to Aurora from the same microservice are not affected.
These pods hit the issue for a few minutes and then start working properly again on their own.
Restarting the pod also makes it work again.
Things we have tried:
// Retry with the SDK's default condition and backoff. The final 'true' means a
// maxErrorRetry set on ClientConfiguration takes precedence over MAX_RETRY_COUNT_AWS_TOKEN_EXPIRED.
RetryPolicy retryPolicy = new RetryPolicy(PredefinedRetryPolicies.DEFAULT_RETRY_CONDITION,
        PredefinedRetryPolicies.DEFAULT_BACKOFF_STRATEGY, MAX_RETRY_COUNT_AWS_TOKEN_EXPIRED, true);
ClientConfiguration clientConfiguration = new ClientConfiguration().withRetryPolicy(retryPolicy);
return AmazonSNSClientBuilder.standard().withClientConfiguration(clientConfiguration).build();
We also retry the block above from the exception catch block: if snsClient.publish fails with the token expiry error, we instantiate a new snsClient, assuming the new client will pick up a fresh security token. That does not work either.
According to the AWS SDK documentation, the SDK fetches temporary credentials for the EC2 IAM role from the instance metadata service and refreshes them shortly before they expire. But at times this does not seem to be working.
My questions:
What could be the issue?
How can we debug whether the call from our EC2 instance to the instance metadata service is failing? CloudTrail is not showing anything.
We sometimes face DNS resolver issues in our ecosystem. Could that be the cause? Does the EC2 instance metadata service also use the DNS resolver to connect to AWS to get a new STS token?
95% of the pods work well; 5% fail with this issue for a few minutes a week.
Please suggest.
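One way to approach the debugging question above is to query the instance metadata service directly from inside an affected pod and look at whether the role credentials come back and when they expire. The following is only a sketch, not a confirmed diagnosis: it assumes IMDSv2 is reachable from the pod at the link-local address 169.254.169.254 (an IP address, so this particular request involves no DNS lookup), and the function names are made up for this example.

```python
import json
import urllib.request
from datetime import datetime, timezone

IMDS = "http://169.254.169.254/latest"  # link-local IMDS address (an IP, not a hostname)


def seconds_until_expiry(creds_doc, now=None):
    """Return the seconds left before the credentials in an IMDS document expire."""
    now = now or datetime.now(timezone.utc)
    expiry = datetime.strptime(creds_doc["Expiration"], "%Y-%m-%dT%H:%M:%SZ")
    return (expiry.replace(tzinfo=timezone.utc) - now).total_seconds()


def _get(url, headers, timeout):
    """Issue a GET request and return the decoded response body."""
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return resp.read().decode()


def fetch_role_credentials(timeout=2.0):
    """Fetch the instance-role credential document via IMDSv2."""
    token_request = urllib.request.Request(
        IMDS + "/api/token",
        method="PUT",
        headers={"X-aws-ec2-metadata-token-ttl-seconds": "60"},
    )
    with urllib.request.urlopen(token_request, timeout=timeout) as resp:
        headers = {"X-aws-ec2-metadata-token": resp.read().decode()}
    base = IMDS + "/meta-data/iam/security-credentials/"
    role = _get(base, headers, timeout).splitlines()[0]  # first attached role name
    return json.loads(_get(base + role, headers, timeout))


def report():
    """Print the credential status; a timeout or HTTP error here points at IMDS."""
    doc = fetch_role_credentials()
    print("Code:", doc.get("Code"), "LastUpdated:", doc.get("LastUpdated"))
    print("Seconds until expiry:", seconds_until_expiry(doc))
```

Calling report() periodically from a failing pod and logging the result would show whether the metadata call times out or returns credentials that are already close to expiry. If the pods obtain credentials some other way than the instance role, this check does not apply.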
Related
We have an AWS Org with AWS Grafana running in the root account, set up with Org access.
We have successfully connected to AWS Prometheus and other data sources across different organization accounts, but can't get AWS Grafana to connect to an Amazon OpenSearch cluster that is hosted in a VPC.
If you look at Grafana -> AWS Data Sources -> Amazon OpenSearch Service, it lists the cluster. But all attempts to connect have failed.
We have tried:
Using SigV4 auth
Using Basic auth + With Credentials (even adding VPC connections between accounts and checking that ports are open)
When we click Save and Test, we always get "Testing.." followed by "OpenSearch error: Bad Gateway" in Grafana.
Has anyone got it working successfully and able to assist?
Same issue here, except that Grafana is set up in the same account as the OpenSearch cluster.
I also tried configuring the security group on the OpenSearch cluster to accept everything (all ports, all protocols, from anywhere).
I'm wondering if it's a network issue: with the OpenSearch cluster in a VPC, can Grafana access it? But I can't find documentation on the networking side of managed Grafana.
Hope someone will help.
Been told it’s a known issue.
The workaround is to create a proxy for your OpenSearch cluster and give it internet access so that Grafana can connect to it.
No idea on timelines for AWS to build / fix the problem :(
A solution that works well on my side is to fill in the fields:
HTTP part:
URL: https://search-anything
Access: Server (default)
Auth part:
Check Basic auth
then in Basic Auth Details fill in the master username and password
OpenSearch details part:
fill in the name of an index
make sure that a timestamp field exists in the index filled above and put the name of this field in Time field name
choose the right OpenSearch version 1.0.x
Test
I hope this will help you
I'm using spring-cloud-starter-aws-jdbc to connect to an RDS instance. I initially went the traditional spring.datasource route, but I needed to make use of read-replicas and wanted to configure this without introducing any weird code.
The error I'm getting is:
Caused by: com.amazonaws.SdkClientException: Unable to load AWS credentials from any provider in the chain: [com.amazonaws.auth.EC2ContainerCredentialsProviderWrapper#47acd13b: Failed to connect to service endpoint: , com.amazonaws.auth.profile.ProfileCredentialsProvider#6f8e9d06: profile file cannot be null]
| at com.amazonaws.auth.AWSCredentialsProviderChain.getCredentials(AWSCredentialsProviderChain.java:136)
Initially I tried adding an AmazonRDS bean to my configuration and provided the credentials directly, but that wasn't good enough. I set a breakpoint inside getCredentials() and can see it being called twice: the first time there are 5 credential providers, one of which contains the AWS credentials I'm passing in via environment variables.
The second time, there are only two providers, neither of which contains my credentials, and so the app crashes. Has anyone used this library successfully? I can't figure out why it's fetching the credentials twice when I've already provided the RDS client and even tried providing the credentials with a bean.
When trying to connect to an AWS service via Boto3, I occasionally get the following error:
NoAuthHandlerFound: No handler was ready to authenticate. 1 handlers were checked. ['HmacAuthV3Handler'] Check your credentials
This is running on an EC2 instance with an IAM Role configured. This error happens rarely.
IAM roles provide credentials via the AWS metadata service. Boto3 will connect to this service to get credentials, but this connection can time out. By default, Boto3 will not retry connections to the metadata service, but this can be changed by setting the environment variable AWS_METADATA_SERVICE_NUM_ATTEMPTS to a number higher than 1.
See the docs:
AWS_METADATA_SERVICE_NUM_ATTEMPTS
When attempting to retrieve credentials on an EC2 instance that has been configured with an IAM role, boto3 will only make one attempt to retrieve credentials from the instance metadata service before giving up. If you know your code will be running on an EC2 instance, you can increase this value to make boto3 retry multiple times before giving up.
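As a rough illustration of the setting described above, the variable can be set in the process environment before the first boto3 client is created. The value 5 and the helper names are arbitrary choices for this sketch; AWS_METADATA_SERVICE_TIMEOUT is a companion setting from the same docs that controls how many seconds each attempt waits.

```python
import os


def configure_metadata_retries(env=os.environ, attempts=5, timeout_seconds=2):
    """Set the metadata-service settings unless they are already present."""
    # setdefault keeps any value an operator has already exported.
    env.setdefault("AWS_METADATA_SERVICE_NUM_ATTEMPTS", str(attempts))
    env.setdefault("AWS_METADATA_SERVICE_TIMEOUT", str(timeout_seconds))
    return env


def make_sns_client(region_name):
    """Create an SNS client only after the environment has been configured."""
    import boto3  # imported lazily so the helper above works without boto3

    configure_metadata_retries()
    return boto3.client("sns", region_name=region_name)
```

Exporting the same variables in the service's launch environment achieves the same thing without touching application code.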
The official AWS documentation states that instance profile credentials "are temporary and would eventually expire", and I was wondering how often they expire.
I'm asking because I have applications that use an InstanceProfileCredentialsProvider as the credential provider, which by default does not refresh credentials, and they have been running for days without problems.
We have noticed from logging that the temporary credentials issued against an attached role last approximately 6 hours.
Does anyone know the mechanism of how they are refreshed, supposedly 15 minutes before they expire? Is the SSM service monitoring the expiration and asking for new credentials?
We are currently chasing down what appears to be an issue with the credentials not being refreshed after the EC2 instance has had no activity overnight. We are trying to determine whether the app pool idle timeout or recycle interval is playing a part.
I set up IAM authentication on an RDS instance, and I'm able to use IAM to get database passwords that work for 15 minutes. This is fine for accessing the database for backups, but this database backs a web application, so currently after 15 minutes the password used by the app to connect to the DB becomes invalid and the app crashes because it can no longer access the DB.
However, in the RDS IAM docs there's this line:
For applications running on Amazon EC2, you can use EC2 instance profile credentials to access the database, so you don't need to use database passwords on your EC2 instance.
This implies that on EC2 there's no need to use the IAM temporary DB password, which would mean that my app should be able to connect to the DB as long as it's running on EC2 and I set up the role permissions (which I think I did correctly). However, I can't get my app running on EC2 to be able to connect to the RDS DB except by using the 15-minute temporary password. If I try connecting with a normal MySQL connection with no password I get permission denied. Is there something special that needs to be done to connect to RDS using the EC2 instance profile, or is it not possible without using 15-minute temporary passwords?
According to the documentation you linked (http://docs.aws.amazon.com/AmazonRDS/latest/UserGuide/UsingWithRDS.IAMDBAuth.html), you need to perform the following steps (See under "Authenticating to a DB Instance or DB Cluster Using IAM Database Authentication"):
Use the AWS SDK for Java or AWS CLI to get an authentication token you can use to identify the IAM user or role. To learn how to get an authentication token, see Getting an Authentication Token.
Connect to the database using an SSL connection, specifying the IAM user or role as the database user account and the authentication token as the password. For more information, see Connecting to a DB Instance or DB Cluster Using IAM Database Authentication.
That means for every connection you intend to open, you need to get a valid Token using the AWS SDK. This is where using the correct instance profile with the RDS permission is needed. See also the code examples further down the AWS documentation page.
I think however this requires quite a bit of effort on your side, to always get a valid token before opening a connection. It makes using an off-the-shelf connection pool difficult. Probably once open, the connection will remain open even after the token expires, but you still need to handle the case where more connections need to be opened at a later time.
I would stick with normal user/password access for the application; using IAM for this case seems to be too much effort.
For applications running on Amazon EC2, you can use EC2 instance profile credentials to access the database, so you don't need to use database passwords on your EC2 instance.
You're misinterpreting what this means. It means you don't have to use static passwords or store them on the instance.
The idea is that you generate a new authentication token each time you establish a connection to the database. The token is generated on your instance, using the instance role credentials. It can only be used to authenticate for 15 minutes, but once connected, you don't lose your database connection after 15 minutes. You remain connected.
If your application doesn't reuse database connections, then you likely have a design flaw there.
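The token-per-connection idea described above could be sketched roughly as follows. This is an illustration under assumptions, not the documented AWS example: ConnectionFactory and the connect hook are hypothetical names, while generate_db_auth_token is the boto3 RDS client call that produces the token from the instance role credentials.

```python
def rds_token_provider(host, port, user, region):
    """Return a callable that asks boto3 for a fresh IAM auth token on each call."""
    import boto3  # lazy import; needs boto3 and instance-role credentials at runtime

    client = boto3.client("rds", region_name=region)
    return lambda: client.generate_db_auth_token(
        DBHostname=host, Port=port, DBUsername=user, Region=region
    )


class ConnectionFactory:
    """Open database connections, requesting a new token for every new connection."""

    def __init__(self, token_provider, connect):
        self._token_provider = token_provider  # e.g. rds_token_provider(...)
        self._connect = connect  # hypothetical hook: connect(password) -> connection

    def new_connection(self):
        # The token is only used as the password while the connection is being
        # established; an open connection is not dropped when the token expires.
        return self._connect(self._token_provider())
```

A connection pool would call new_connection() whenever it needs to grow, with connect wrapping the driver's own connect call over SSL, so existing connections keep being reused and only new ones pay for a token.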