Unable to push docker image to ECR repo - amazon-ecr

[ec2-user#ip-172-31-93-123 docker]$ docker push xxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/hello-repository:latest
The push refers to repository [0xxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/hello-repository]
cf125602020d: Retrying in 1 second
dab19ae02f77: Retrying in 1 second
1ba30a635762: Retrying in 1 second
4a641e21953d: Retrying in 1 second
EOF
I have granted all of the permissions below, but I am still not able to push:
AmazonEC2ContainerRegistryFullAccess
AmazonEC2ContainerRegistryPowerUser
AmazonECSTaskExecutionRolePolicy
AmazonECS_FullAccess
AmazonElasticContainerRegistryPublicFullAccess
AmazonElasticContainerRegistryPublicPowerUser
Can anyone help me resolve this?
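For reference, when an ECR push keeps retrying and then fails with EOF even though the IAM permissions are in place, the usual first step is to re-authenticate Docker against the registry and confirm the image is tagged with the full repository URI. A minimal sketch, assuming the AWS CLI v2, the us-east-1 region, and the redacted registry from the question (the <local-image> name is a placeholder):

# authenticate Docker to the private ECR registry
aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin xxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com
# tag the local image with the full repository URI, then push
docker tag <local-image>:latest xxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/hello-repository:latest
docker push xxxxxxxxxx.dkr.ecr.us-east-1.amazonaws.com/hello-repository:latest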

Related

Error connecting to pod on AWS machine from JProfiler 13.0.2

I am trying to connect to my pod on an AWS machine from JProfiler 13.0.2:
Quick Attach -> On a Kubernetes Cluster -> Kubectl on another computer -> give SSH connection details and press Start
I get a list of all pods, select the container of one of them, and click OK. I then get an error:
An exception occurred while connecting to the selected container.
The error message was:
java.io.IOException: Could not copy agent to docker container
tar: .jprofiler13/agent/13094_13.0.2: Cannot open: No such file or directory
tar: Error is not recoverable: exiting now
tar: short read
command terminated with exit code 1
exit code: 1

kubectl apply results in ImagePullBackOff

I'm getting the following error when running kubectl apply -f https://k8s.io/examples/pods/simple-pod.yaml:
Warning Failed 4s (x2 over 17s) kubelet Failed to pull image "nginx:1.14.2": rpc error: code = Unknown desc = failed to pull and unpack image "docker.io/library/nginx:1.14.2": failed to resolve reference "docker.io/library/nginx:1.14.2": failed to do request: Head "https://registry-1.docker.io/v2/library/nginx/manifests/1.14.2": x509: certificate signed by unknown authority
I've added all the root CAs to /etc/ssl/certs/ca-certificates.crt on Ubuntu, yet the above kubectl command is still failing. Is there any specific location kubectl looks for the root CA certs?
Any help is appreciated!
EDIT:
Okay, I was confused: it is actually my worker nodes that can't pull the image, not my local environment, so I need to find a way to install the root CA on every worker node. Still, I wonder why the AKS install did not fail in the first place ...
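A sketch of what installing the root CA on an Ubuntu-based worker node could look like; the certificate file name and the use of containerd as the runtime are assumptions, not details from the question:

# copy the missing root CA into the system trust store (file name is an example)
sudo cp my-root-ca.crt /usr/local/share/ca-certificates/my-root-ca.crt
sudo update-ca-certificates
# restart the container runtime so image pulls pick up the updated trust store
sudo systemctl restart containerd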

Unable to start the Amazon SSM Agent - failed to start message bus

When registering an Amazon SSM Agent, it registers successfully in the SSM Managed Instances console, but the connection shows "Connection Lost".
When I try to start the service manually, I get the following error:
Error occurred fetching the seelog config file path: open /etc/amazon/ssm/seelog.xml: no such file or directory
Initializing new seelog logger
New Seelog Logger Creation Complete
2020-12-09 10:20:01 ERROR error occurred when starting amazon-ssm-agent: failed to start message bus, failed to start health channel: failed to listen on the channel: ipc:///var/lib/amazon/ssm/ipc/health, address in use
How exactly do I solve this? I've tried restarting the service a few times, but with no luck.
I was able to fix this issue by stopping the agent and purging the /var/lib/amazon/ssm/ipc directory:
service amazon-ssm-agent stop
rm -rf /var/lib/amazon/ssm/ipc
service amazon-ssm-agent start
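On a systemd-based distribution the equivalent sequence would be the following; this is a sketch assuming the unit is named amazon-ssm-agent:

sudo systemctl stop amazon-ssm-agent
sudo rm -rf /var/lib/amazon/ssm/ipc
sudo systemctl start amazon-ssm-agent
# confirm the agent is running and the health channel error is gone
sudo systemctl status amazon-ssm-agent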

CockroachDB on AWS EKS cluster - [n?] no stores bootstrapped

I am attempting to deploy CockroachDB v2.1.6 to a new AWS EKS cluster. Everything deploys successfully: the StatefulSet, services, PVs, and PVCs are created. The AWS EBS volumes are created successfully too.
The issue is the pods never get to a READY state.
pod/cockroachdb-0 0/1 Running 0 14m
pod/cockroachdb-1 0/1 Running 0 14m
pod/cockroachdb-2 0/1 Running 0 14m
If I 'describe' the pods I get the following:
Normal Pulled 46s kubelet, ip-10-5-109-70.eu-central-1.compute.internal Container image "cockroachdb/cockroach:v2.1.6" already present on machine
Normal Created 46s kubelet, ip-10-5-109-70.eu-central-1.compute.internal Created container cockroachdb
Normal Started 46s kubelet, ip-10-5-109-70.eu-central-1.compute.internal Started container cockroachdb
Warning Unhealthy 1s (x8 over 36s) kubelet, ip-10-5-109-70.eu-central-1.compute.internal Readiness probe failed: HTTP probe failed with statuscode: 503
If I examine the logs of a pod I see this:
I200409 11:45:18.073666 14 server/server.go:1403 [n?] no stores bootstrapped and --join flag specified, awaiting init command.
W200409 11:45:18.076826 87 vendor/google.golang.org/grpc/clientconn.go:1293 grpc: addrConn.createTransport failed to connect to {cockroachdb-0.cockroachdb:26257 0 <nil>}. Err :connection error: desc = "transport: Error while dialing dial tcp: lookup cockroachdb-0.cockroachdb on 172.20.0.10:53: no such host". Reconnecting...
W200409 11:45:18.076942 21 gossip/client.go:123 [n?] failed to start gossip client to cockroachdb-0.cockroachdb:26257: initial connection heartbeat failed: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup cockroachdb-0.cockroachdb on 172.20.0.10:53: no such host"
I came across this comment from the CockroachDB forum (https://forum.cockroachlabs.com/t/http-probe-failed-with-statuscode-503/2043/6)
Both the cockroach_out.log and cockroach_output1.log files you sent me (corresponding to mycockroach-cockroachdb-0 and mycockroach-cockroachdb-2) print out no stores bootstrapped during startup and prefix all their log lines with n?, indicating that they haven’t been allocated a node ID. I’d say that they may have never been properly initialized as part of the cluster.
I have deleted everything, including the PVs, PVCs, and AWS EBS volumes, with kubectl delete and reapplied the manifests, but the issue persists.
Any thoughts would be very much appreciated. Thank you.
I was not aware that you had to initialize the CockroachDB cluster after creating it. I did the following to resolve my issue:
kubectl exec -it cockroachdb-0 -n <namespace> -- /bin/sh
/cockroach/cockroach init
See here for more details - https://www.cockroachlabs.com/docs/v19.2/cockroach-init.html
After this the pods started running correctly.
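For reference, a sketch of the same initialization with the namespace and security mode spelled out; the <namespace> placeholder and the --insecure flag are assumptions (a secure deployment would use --certs-dir instead):

kubectl exec -it cockroachdb-0 -n <namespace> -- /cockroach/cockroach init --insecure
# verify that all nodes have joined and been assigned node IDs
kubectl exec -it cockroachdb-0 -n <namespace> -- /cockroach/cockroach node status --insecure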

I have started to receive a 402 error when accessing my CoreOS cluster

I have started to receive a 402 error when accessing my CoreOS cluster. It had been working fine until a day ago. Does anybody have any idea why I'm receiving this error? I am using the stable channel on EC2.
$ fleetctl list-machines
E0929 09:43:14.823081 00979 fleetctl.go:151] error attempting to check latest fleet version in Registry: 402: Standby Internal Error () [0]
Error retrieving list of active machines: 402: Standby Internal Error () [0]
In this case, etcd does not currently have quorum. The "Standby Internal Error" signifies that the node is attempting to act as a standby but is failing to redirect you to the active node. Repairing the etcd issue will fix the problem. Check on the status of etcd by running:
journalctl -u etcd.service
on each of the nodes; this should give you the information you need to repair etcd in this case.
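A sketch of how to check quorum directly from one of the nodes, assuming the default etcd client port 4001 that CoreOS used at the time; on etcd 2.x, etcdctl cluster-health reports the same information:

# ask the local member for its own state (leader, follower, or standby)
curl -s http://127.0.0.1:4001/v2/stats/self
# list the machines this member believes are in the cluster
curl -s http://127.0.0.1:4001/v2/machines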