Not able to pull Private Images from Artifactory to AWS Fargate - amazon-web-services

I am having an issue pulling private images from Artifactory to AWS Fargate. It is showing an error "access violation". Anybody getting the same error while running task in AWS Fargate?
Status reason : CannotPullContainerError: API error (500): Get https://xxx.artifactory.xx:xxx/v2/: Access violation

This issue has been fixed in Fargate Platform Version‐1.3.0.
In 1.3.0, along with Secrets support, AWS fixed the issue of pulling images from the private registry which runs on HTTPS ports other than 443.
https://docs.aws.amazon.com/AmazonECS/latest/developerguide/platform_versions.html
https://aws.amazon.com/about-aws/whats-new/2018/12/aws-fargate-platform-version-1-3-adds-secrets-support/

AWS have recognized this as an issue with Fargate:
Thank you for your patience. Here's an update which I received from our Fargate team on this issue.
We have identified an issue were customers are unable to pull images from private container image registries running on non-443 HTTPS ports. We have fixed the issue and will make it available with the next AWS Fargate Platform Version release (mid December).
I hope the above answers your questions and I do apologize for the issues you faced. Please do get back on the case if you needed more information. Have a great day ahead :)
To retry, then, after mid-December

Related

Regarding the pod issue in AWS EKS

I have created a cluster in AWS EKS with 2 nodes, where it is a web based application and i am using EBS volumes on the Pods. I am able to communicate in initial login page with the server and can proceed with other steps as well. But sometimes for one particular module it is throwing an "internal server error:500" which i used to inspect option whether it is from the development code issue or from the AWS code issue.
Only for one particular Pod(service) the issue is coming and throwing an error whether it is from network related issue, which the communication is not happening or not. from the server the data is unable to pull or not unable to understand.

AWS EB can't talk to the app servers... what could block that

We had a working env yesterday morning. One person was working with some VPC peering stuff. And somehow it seems blocked the ELB from being able to talk to the application servers. So we can't deploy.
We get this error
Command failed on instance. An unexpected error has occurred [ErrorCode: 0000000001].
And EB can't retrieve any logs from the app server either. So clearly it is getting blocked. The IAM of the instance has AWSElasticBeanstalkWebTier and is the same IAM used in another env that works.
Also, I can RDP to the app instance from my laptop. Though I added my IP to one of the security groups for the instance. I audited the SG's for the working env, and don't see anything specific for EB.
Everything points to something with the VPC... but what should I look at?
So turns out that that error message doesn't always mean it can't talk to the instance. We had updated the AMI a while back and used the wrong one... it wasn't compatible with the platform. So nothing worked right, including getting the logs. So the fix was to sync up the AMI and the EB platform.

Why does AWS EC2 instance authentication keeps stop working (session maybe)?

Problem In Short: PHP website hosted on AWS EC2 session stops working and restarting the EC2 instance fixes it.
=================
I have this website built on custom PHP hosted on AWS as an instance of EC2, DB is MySql RDS. Everything was working just fine a few weeks back. Now at certain times, the user can't log in to the website. I reckon there's some problem with the session but not sure.
But whenever I restart the ec2 instance, the authentication (user/pass) for the website starts working again. This is very strange because.. there's is no such issues on AWS troubleshooting or in Stackoverflow. I hope there is a permanent solution rather than having to restart the server every other day.
This is an e-commerce site with at least 500 to 700 orders per day.
If an issue is resolved by the restart of server/EC2-instance, then most likely you have memory/CPU related problems in your application.
You can do the following to nail the issue
Trace the cloudwatch statistics for your EC2 instance for CPU and memory
Set up another environment for load testing your application
Generate load script to simulate the scenario leading to the problem
Run a code profiler to investigate the problematic code
Fix problems, run the load again to verify your changes
Apply the changes in prod and hope your application rocks in production afterward
EDIT : As suggested by #Boinst, as in interm solution you can schedule restart of EC2 instance, while you find the root cause. One of the ways to do that can be to use AWS CLI
aws ec2 reboot-instances --instance-ids yourInstanceId
you can add a cronjob/scheduled task a machine setup with AWS CLI.

AWS Fargate 503 Service Temporarily Unavailable

I'm trying to deploy backend application to the AWS Fargate using cloudformation templates that I found. When I was using the docker image training/webapp I was able to successfully deploy it and access with the externalUrl from the networking stack for the app.
When I try to deploy our backend image I can see the stacks are deploying correctly but when I try to go to the externalUrl I get 503 Service Temporarily Unavailable and I'm unable to see it... Another thing that I've noticed is on the docker hub I can see that the image is continuously pulled all the time when the cloudformation services are running...
The backend is some kind of maven project I don't know exactly what but I know that locally its working but to get it up running the container with this backend image takes about 8 minutes... I'm not sure if this affects the Fargate ?? Any Idea how to get it working ?
It sounds like you need to find the actual error that you're experiencing, the 503 isn't enough information. Can you provide some other context?
I'm not familiar with fargate but have been using ecs quite a bit this year and I generally would find that by going to (on the dashboard) ecs -> cluster -> service -> events. The events tab gives more specific errors as to what is happening.
My ecs deployment problems are generally summarized into
the container is not exposing the same port as is in the definition, this could be the case if you're deploying from a stack written by someone else.
the task definition memory/cpu restrictions don't grant enough space for the application and it has trouble placing (probably a problem with ecs more than fargate but you never know.)
Your timeout in the task definition is not set to 8 minutes: see this question, it has a lot of this covered
Your start command in the task definition does not work as expected with the container you're trying to deploy
If it is pulling from docker hub continuously my bet would be that it's 1, 3 or 4, and it's attempting to pull the image over and over again.
Try adding a Health check grace period of 60 by going to ECS -> cluster -> service -> update Network Access section.

S3 to EC2 problems

I´m trying to transfer a project from TeamCity to my EC2 server using CodeDeploy. In the process we had a problem and the file isn´t being transfered from the S3 service to our EC2 instance. The error message is :
The overall deployment failed because too many individual instances
failed deployment, too few healthy instances are available for
deployment, or some instances in your deployment group are
experiencing problems. (Error code: HEALTH_CONSTRAINTS)
The team decided that the best way to solve this problem was reading the server log, but in the process we noticed that the server keeps shutting down alone and that was a huge problem, we tought that can be solved with the logs, so we tried to get them using CodeWatch ( our team created the correct IAM and with a run command installed the agent on the server). Sadly this work only managed to get shutting down or turning information on logs.
At this moment we don´t know how to solve this problem but our plain was to get all the logs and then see what is wrong.
I´m stucked at this part since setember, can someone help me ?