I'm trying to deploy a simple REST API written in Golang to AWS EKS.
I created an EKS cluster on AWS using Terraform and applied the AWS load balancer controller Helm chart to it.
All resources in the cluster look like:
NAMESPACE NAME READY STATUS RESTARTS AGE
kube-system pod/aws-load-balancer-controller-5947f7c854-fgwk2 1/1 Running 0 75m
kube-system pod/aws-load-balancer-controller-5947f7c854-gkttb 1/1 Running 0 75m
kube-system pod/aws-node-dfc7r 1/1 Running 0 120m
kube-system pod/aws-node-hpn4z 1/1 Running 0 120m
kube-system pod/aws-node-s6mng 1/1 Running 0 120m
kube-system pod/coredns-66cb55d4f4-5l7vm 1/1 Running 0 127m
kube-system pod/coredns-66cb55d4f4-frk6p 1/1 Running 0 127m
kube-system pod/kube-proxy-6ndf5 1/1 Running 0 120m
kube-system pod/kube-proxy-s95qk 1/1 Running 0 120m
kube-system pod/kube-proxy-vdrdd 1/1 Running 0 120m
NAMESPACE NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
default service/kubernetes ClusterIP 10.100.0.1 <none> 443/TCP 127m
kube-system service/aws-load-balancer-webhook-service ClusterIP 10.100.202.90 <none> 443/TCP 75m
kube-system service/kube-dns ClusterIP 10.100.0.10 <none> 53/UDP,53/TCP 127m
NAMESPACE NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
kube-system daemonset.apps/aws-node 3 3 3 3 3 <none> 127m
kube-system daemonset.apps/kube-proxy 3 3 3 3 3 <none> 127m
NAMESPACE NAME READY UP-TO-DATE AVAILABLE AGE
kube-system deployment.apps/aws-load-balancer-controller 2/2 2 2 75m
kube-system deployment.apps/coredns 2/2 2 2 127m
NAMESPACE NAME DESIRED CURRENT READY AGE
kube-system replicaset.apps/aws-load-balancer-controller-5947f7c854 2 2 2 75m
kube-system replicaset.apps/coredns-66cb55d4f4 2 2 2 127m
I can run the application locally with Go and with Docker, but deploying it to AWS EKS always ends in CrashLoopBackOff.
Running kubectl describe pod PODNAME shows:
Name: go-api-55d74b9546-dkk9g
Namespace: default
Priority: 0
Node: ip-172-16-1-191.ec2.internal/172.16.1.191
Start Time: Tue, 15 Mar 2022 07:04:08 -0700
Labels: app=go-api
pod-template-hash=55d74b9546
Annotations: kubernetes.io/psp: eks.privileged
Status: Running
IP: 172.16.1.195
IPs:
IP: 172.16.1.195
Controlled By: ReplicaSet/go-api-55d74b9546
Containers:
go-api:
Container ID: docker://a4bc07b60c85fd308157d967d2d0d688d8eeccfe4c829102eb929ca82fb25595
Image: saurabhmish/golang-hello:latest
Image ID: docker-pullable://saurabhmish/golang-hello@sha256:f79a495ad17710b569136f611ae3c8191173400e2cbb9cfe416e75e2af6f7874
Port: 3000/TCP
Host Port: 0/TCP
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Tue, 15 Mar 2022 07:09:50 -0700
Finished: Tue, 15 Mar 2022 07:09:50 -0700
Ready: False
Restart Count: 6
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-jt4gp (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
kube-api-access-jt4gp:
Type: Projected (a volume that contains injected data from multiple sources)
TokenExpirationSeconds: 3607
ConfigMapName: kube-root-ca.crt
ConfigMapOptional: <nil>
DownwardAPI: true
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 7m31s default-scheduler Successfully assigned default/go-api-55d74b9546-dkk9g to ip-172-16-1-191.ec2.internal
Normal Pulled 7m17s kubelet Successfully pulled image "saurabhmish/golang-hello:latest" in 12.77458991s
Normal Pulled 7m16s kubelet Successfully pulled image "saurabhmish/golang-hello:latest" in 110.127771ms
Normal Pulled 7m3s kubelet Successfully pulled image "saurabhmish/golang-hello:latest" in 109.617419ms
Normal Created 6m37s (x4 over 7m17s) kubelet Created container go-api
Normal Started 6m37s (x4 over 7m17s) kubelet Started container go-api
Normal Pulled 6m37s kubelet Successfully pulled image "saurabhmish/golang-hello:latest" in 218.952336ms
Normal Pulling 5m56s (x5 over 7m30s) kubelet Pulling image "saurabhmish/golang-hello:latest"
Normal Pulled 5m56s kubelet Successfully pulled image "saurabhmish/golang-hello:latest" in 108.105083ms
Warning BackOff 2m28s (x24 over 7m15s) kubelet Back-off restarting failed container
Running kubectl logs PODNAME and kubectl logs PODNAME -c go-api shows standard_init_linux.go:228: exec user process caused: exec format error
Manifests:
go-deploy.yaml (this is the Docker Hub image with documentation)
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: go-api
  labels:
    app: go-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: go-api
  strategy: {}
  template:
    metadata:
      labels:
        app: go-api
    spec:
      containers:
      - name: go-api
        image: saurabhmish/golang-hello:latest
        ports:
        - containerPort: 3000
        resources: {}
go-service.yaml
---
kind: Service
apiVersion: v1
metadata:
  name: go-api
spec:
  selector:
    app: go-api
  type: NodePort
  ports:
  - protocol: TCP
    port: 80
    targetPort: 3000
How can I fix this error?
Posting this as Community wiki for better visibility.
Feel free to expand it.
Thanks to @David Maze, who pointed to the solution. The article 'Build Intel64-compatible Docker images from Mac M1 (ARM)' by Beppe Catanese
describes the underlying problem well.
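You are developing and building on the ARM architecture (Mac M1), but you deploy the Docker image to a Kubernetes cluster whose nodes are x86-64. A binary built for one architecture cannot be executed on the other, which is what the "exec format error" in the logs means.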
Solution:
Option A: use buildx
Buildx is a Docker plugin that, among other features, lets you build images for various target platforms.
$ docker buildx build --platform linux/amd64 -t myapp .
Option B: set DOCKER_DEFAULT_PLATFORM
The DOCKER_DEFAULT_PLATFORM environment variable sets the default platform for commands that take the --platform flag.
export DOCKER_DEFAULT_PLATFORM=linux/amd64
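Before rebuilding, it can help to confirm the mismatch by comparing the platform of the image with the architecture of the cluster nodes. The following is only a minimal sketch, not a step from the article: the helper names to_docker_arch, image_platform and node_archs are made up for illustration, and it assumes docker and kubectl are installed and pointed at the right cluster.

```shell
#!/bin/sh
# Hypothetical helpers for spotting an image/node architecture mismatch.

# Map `uname -m` style machine names to the names Docker uses in platforms.
to_docker_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "$1" ;;   # pass unknown names through unchanged
  esac
}

# Print os/architecture of a locally available image (for example linux/arm64).
image_platform() {
  docker image inspect --format '{{.Os}}/{{.Architecture}}' "$1"
}

# Print the CPU architecture reported by every node in the cluster.
node_archs() {
  kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'
}

# Example usage (needs docker and cluster access, so it is left commented out):
#   image_platform saurabhmish/golang-hello:latest
#   node_archs
#   echo "this machine builds for linux/$(to_docker_arch "$(uname -m)")"
```

If the image platform and the node architecture differ, rebuilding with Option A or Option B and pushing the new image should resolve the error.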
A CrashLoopBackOff means that a pod starts, crashes, is restarted, and then crashes again, over and over.
The error may come from the application itself, for example because it cannot connect to a database, Redis, and so on.
You may find something useful here:
My kubernetes pods keep crashing with "CrashLoopBackOff" but I can't find any log
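As a rough sketch of how to look for the cause, the logs of the previous (crashed) container instance and the pod description are usually the first places to check. The function name crash_report is hypothetical, and the KUBECTL override is just a convenience assumed here so the commands can be previewed without a cluster.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Show why a crash-looping container exited.
# $1 = pod name, $2 = container name
crash_report() {
  # Logs of the previous (crashed) instance of the container.
  "$KUBECTL" logs "$1" -c "$2" --previous
  # State, exit code and recent events for the pod.
  "$KUBECTL" describe pod "$1"
}

# Example usage (needs cluster access):
#   crash_report go-api-55d74b9546-dkk9g go-api
```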
I have a cluster and a node created in AWS EKS. I applied a deployment to that cluster as follows:
kubectl apply -f deployment.yaml
where deployment.yaml contains the container specification along with the Docker Hub repository and image.
However, I made a mistake in deployment.yaml and I need to re-apply the corrected configuration.
My question is:
1 - How do I re-apply deployment.yaml to the AWS EKS cluster using kubectl?
Just running the above command again (kubectl apply -f deployment.yaml) is not working.
2 - After I re-apply deployment.yaml, will the node go and pick up the Docker Hub image, or do I still need to do something else (supposing all the other details are OK)?
Some outputs below:
>> kubectl get pods
my-app-786dc95d8f-b6w4h 0/1 ImagePullBackOff 0 9h
my-app-786dc95d8f-w8hkg 0/1 ImagePullBackOff 0 9h
kubectl describe pod my-app-786dc95d8f-b6w4h
Name: my-app-786dc95d8f-b6w4h
Namespace: default
Priority: 0
Node: ip-192-168-24-13.ec2.internal/192.168.24.13
Start Time: Fri, 10 Jul 2020 12:54:38 -0400
Labels: app=my-app
pod-template-hash=786dc95d8f
Annotations: kubernetes.io/psp: eks.privileged
Status: Pending
IP: 192.168.7.235
IPs:
IP: 192.168.7.235
Controlled By: ReplicaSet/my-app-786dc95d8f
Containers:
simple-node:
Container ID:
Image: BAD_REPO/simple-node
Image ID:
Port: 80/TCP
Host Port: 0/TCP
State: Waiting
Reason: ImagePullBackOff
Ready: False
Restart Count: 0
Environment: <none>
Mounts:
/var/run/secrets/kubernetes.io/serviceaccount from default-token-mwwvl (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
default-token-mwwvl:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-mwwvl
Optional: false
QoS Class: BestEffort
Node-Selectors: <none>
Tolerations: node.kubernetes.io/not-ready:NoExecute for 300s
node.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal BackOff 17m (x2570 over 9h) kubelet, ip-192-168-24-13.ec2.internal Back-off pulling image "BAD_REPO/simple-node"
Warning Failed 2m48s (x2634 over 9h) kubelet, ip-192-168-24-13.ec2.internal Error: ImagePullBackOff
If you need to change the image:
kubectl set image deployment.v1.apps/{your_deployment_name} {container_name}={image_name}:{tag}
But you can always do:
kubectl delete -f deployment.yaml
kubectl create -f deployment.yaml
Since your image is in ImagePullBackOff, the deployment isn't working anyway, so you can just recreate it. Usually you don't delete and recreate on production, which is why I use the image change all the time; you just have to change the tag for every new image.
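The image change can be wrapped in a small helper, sketched below under a few assumptions: update_image is a hypothetical name, GOOD_REPO and the tag are placeholders, and the KUBECTL override exists only so the commands can be previewed without touching a cluster.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Point one container of a deployment at a new image and wait for the rollout.
# $1 = deployment name, $2 = container name, $3 = image:tag
update_image() {
  "$KUBECTL" set image "deployment/$1" "$2=$3"
  "$KUBECTL" rollout status "deployment/$1"
}

# Example usage (placeholder repository and tag, needs cluster access):
#   update_image my-app simple-node GOOD_REPO/simple-node:1.0.1
```

In the pod description above, the container is named simple-node and the deployment is my-app, so those are the values that would go into the first two arguments.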
ImagePullBackOff means that Kubernetes is not able to pull the image.
Specifically, the pod, which runs under the "default" service account, is not able to pull the image.
To fix this issue, check two things:
Check that there is no typo in the image name and tag, and that the image is publicly available.
If the Docker registry is private, create a secret of type docker-registry and then patch the "default" service account with this secret.
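For the private registry case, the two steps could look roughly like the sketch below. The function names and the secret name regcred are illustrative assumptions, and the registry server, user and password are placeholders to be replaced with real values.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the commands.
KUBECTL="${KUBECTL:-kubectl}"

# Create a registry credential secret.
# $1 = secret name, $2 = registry server, $3 = user, $4 = password or token
create_pull_secret() {
  "$KUBECTL" create secret docker-registry "$1" \
    --docker-server="$2" --docker-username="$3" --docker-password="$4"
}

# Make the "default" service account use the secret for image pulls.
# $1 = secret name
attach_pull_secret() {
  "$KUBECTL" patch serviceaccount default \
    -p "{\"imagePullSecrets\": [{\"name\": \"$1\"}]}"
}

# Example usage (placeholder credentials, needs cluster access):
#   create_pull_secret regcred https://index.docker.io/v1/ myuser mytoken
#   attach_pull_secret regcred
```

Pods that are already stuck in ImagePullBackOff generally need to be recreated after the service account is patched, since the pull secrets are attached when a pod is created.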
I have created a simple nginx deployment on an Ubuntu EC2 instance and exposed it on a port through a service in a Kubernetes cluster, but I am unable to ping the pods even in the local environment. My pods are running fine and the service was also created successfully. I am sharing the output of some commands below.
kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-172-31-39-226 Ready <none> 2d19h v1.16.1
master-node Ready master 2d20h v1.16.1
kubectl get po -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-54f57cf6bf-dqt5v 1/1 Running 0 101m 192.168.39.17 ip-172-31-39-226 <none> <none>
nginx-deployment-54f57cf6bf-gh4fz 1/1 Running 0 101m 192.168.39.16 ip-172-31-39-226 <none> <none>
sample-nginx-857ffdb4f4-2rcvt 1/1 Running 0 20m 192.168.39.18 ip-172-31-39-226 <none> <none>
sample-nginx-857ffdb4f4-tjh82 1/1 Running 0 20m 192.168.39.19 ip-172-31-39-226 <none> <none>
kubectl get svc
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
kubernetes ClusterIP 10.96.0.1 <none> 443/TCP 2d20h
nginx-deployment NodePort 10.101.133.21 <none> 80:31165/TCP 50m
sample-nginx LoadBalancer 10.100.77.31 <pending> 80:31854/TCP 19m
kubectl describe deployment nginx-deployment
Name: nginx-deployment
Namespace: default
CreationTimestamp: Mon, 14 Oct 2019 06:28:13 +0000
Labels: <none>
Annotations: deployment.kubernetes.io/revision: 1
kubectl.kubernetes.io/last-applied-configuration:
{"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"name":"nginx-deployment","namespace":"default"},"spec":{"replica...
Selector: app=nginx
Replicas: 2 desired | 2 updated | 2 total | 2 available | 0 unavailable
StrategyType: RollingUpdate
MinReadySeconds: 0
RollingUpdateStrategy: 25% max unavailable, 25% max surge
Pod Template:
Labels: app=nginx
Containers:
nginx:
Image: nginx:1.7.9
Port: 80/TCP
Host Port: 0/TCP
Environment: <none>
Mounts: <none>
Volumes: <none>
Conditions:
Type Status Reason
---- ------ ------
Available True MinimumReplicasAvailable
Progressing True NewReplicaSetAvailable
OldReplicaSets: <none>
NewReplicaSet: nginx-deployment-54f57cf6bf (2/2 replicas created)
Events: <none>
Now I am unable to ping 192.168.39.17/16/18/19 from the master, and I am also unable to curl 172.31.39.226:31165/31854 from the master. Any help will be highly appreciated.
From the information you have provided and the discussion we had, the worker node has the Nginx pods running, and you have attached a NodePort Service and a LoadBalancer Service to them.
The only thing missing here is access for the server from which you are trying to reach them.
I tried to reach 52.201.242.84:31165. I think all you need to do is open this port for public access, or allow your client IP. This can be done in the security group of the worker node's EC2 instance.
The URL above is constructed from the public IP of the worker node plus the NodePort of the attached service. So here is a simple formula you can use to get the address for reaching the pods:
Pod Access URL = <Public IP of Worker Node>:<NodePort>
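The formula can be expressed as a short sketch. The function names nodeport_url and service_nodeport are hypothetical, and the example assumes the port is already open in the worker node's security group.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# Build the access URL. $1 = public IP of the worker node, $2 = NodePort
nodeport_url() {
  printf 'http://%s:%s\n' "$1" "$2"
}

# Look up the NodePort assigned to the first port of a service. $1 = service name
service_nodeport() {
  "$KUBECTL" get service "$1" -o jsonpath='{.spec.ports[0].nodePort}'
}

# Example usage (needs cluster access and an open security group):
#   curl "$(nodeport_url 52.201.242.84 "$(service_nodeport nginx-deployment)")"
```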
I am unable to deploy nginx containers using kubectl to AWS Fargate using virtual-kubelet. I am following this guide: https://aws.amazon.com/blogs/opensource/aws-fargate-virtual-kubelet/.
I am having an issue with Step 6: Create Kubernetes objects.
I would like to know why the nginx containers are PENDING and why the AWS Fargate task definitions have not been created.
The following are some of the commands I used. I can give more detail upon request.
# ./virtual-kubelet --provider aws --provider-config fargate.toml
...
2019/05/16 06:50:24 Received NodeDaemonEndpoints request.
ERRO[0000] TLS certificates not provided, not setting up pod http server certPath= keyPath= node=virtual-kubelet operatingSystem=Linux provider=aws watchedNamespace=
INFO[0000] Initialized node=virtual-kubelet operatingSystem=Linux provider=aws watchedNamespace=
INFO[0000] Created node node=virtual-kubelet operatingSystem=Linux provider=aws watchedNamespace=
INFO[0000] Node leases not supported, falling back to only node status updates node=virtual-kubelet operatingSystem=Linux provider=aws watchedNamespace=
INFO[0000] Pod cache in-sync node=virtual-kubelet operatingSystem=Linux provider=aws watchedNamespace=
2019/05/16 06:50:25 Received GetPods request.
2019/05/16 06:50:25 Responding to GetPods: [].
INFO[0000] starting workers node=virtual-kubelet operatingSystem=Linux provider=aws watchedNamespace=
INFO[0000] started workers node=virtual-kubelet operatingSystem=Linux provider=aws watchedNamespace=
# kubectl describe node virtual-kubelet
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedToCreateRoute 98s (x951 over 160m) route_controller (combined from similar events): Could not create route e1e32758-77a6-11e9-a68e-0a95bb07bfa2 100.96.4.0/24 for node virtual-kubelet after 47.871544ms: instance not found
# kubectl get nodes
NAME STATUS ROLES AGE VERSION
ip-172-20-47-10.eu-west-2.compute.internal Ready master 30h v1.14.1
ip-172-20-47-242.eu-west-2.compute.internal Ready node 30h v1.14.1
ip-172-20-59-102.eu-west-2.compute.internal Ready node 30h v1.14.1
virtual-kubelet Ready agent 33m v1.13.1-vk-v0.9.0-40-g5b3190ac-dev
kubectl create -f nginx-deployment.yaml
# kubectl get deployments -o wide
# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-c6695csfc-5f7bh 0/1 Pending 0 21m <none> <none> <none> <none>
nginx-deployment-c6695csfc-bwfb8 0/1 Pending 0 21m <none> <none> <none> <none>
nginx-deployment-c6695csfc-mcfvw 0/1 Pending 0 21m <none> <none> <none> <none>
# kubectl describe pod nginx-deployment-c6695csfc-5f7bh
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedScheduling 2m11s (x191 over 22m) default-scheduler 0/4 nodes are available: 1 Insufficient cpu, 1 node(s) had taints that the pod didn't tolerate, 3 node(s) didn't match node selector.
Update:
I then added the nodeSelector label to my nodes by running the following command for each node:
kubectl label nodes ip-172-20-47-15.eu-west-2.compute.internal type=virtual-kubelet
type=virtual-kubelet is the nodeSelector specified in the manifest file, nginx-deployment.yaml.
# kubectl get pods -o wide
NAME READY STATUS RESTARTS AGE IP NODE NOMINATED NODE READINESS GATES
nginx-deployment-c6695csfc-5f7bh 1/1 Running 0 4m59s 100.96.2.7 ip-172-20-47-242.eu-west-2.compute.internal <none> <none>
nginx-deployment-c6695csfc-bwfb8 1/1 Running 0 4m59s 100.96.1.6 ip-172-20-59-102.eu-west-2.compute.internal <none> <none>
nginx-deployment-c6695csfc-mcfvw 1/1 Running 0 4m59s 100.96.2.8 ip-172-20-47-242.eu-west-2.compute.internal <none>
Now when I go to the AWS Fargate Dashboard, the associated task definitions have not been created as shown in the tutorial.
This issue is resolved. I was able to create the AWS Fargate task definitions by adding the ALB security group to the fargate.toml file and by adding tolerations to the nginx-deployment.yaml file, as shown below:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
      tolerations:
      - key: virtual-kubelet.io/provider
        operator: Equal
        value: azure
        effect: NoSchedule
I used kops to install a Kubernetes cluster on AWS.
Then I used Helm to install Prometheus:
$ helm install stable/prometheus \
--set server.persistentVolume.enabled=false \
--set alertmanager.persistentVolume.enabled=false
Then I followed this note to set up a port-forward:
Get the Prometheus server URL by running these commands in the same shell:
export POD_NAME=$(kubectl get pods --namespace default -l "app=prometheus,component=server" -o jsonpath="{.items[0].metadata.name}")
kubectl --namespace default port-forward $POD_NAME 9090
My EC2 instance's public IP on AWS is 12.29.43.14 (not the real address). When I tried to access it from a browser:
http://12.29.43.14:9090
I can't access the page. Why?
Another issue: after installing the Prometheus chart, the alertmanager pod doesn't run:
ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4 1/2 CrashLoopBackOff 1 9s
ungaged-woodpecker-prometheus-kube-state-metrics-5fd97698cktsj5 1/1 Running 0 9s
ungaged-woodpecker-prometheus-node-exporter-45jtn 1/1 Running 0 9s
ungaged-woodpecker-prometheus-node-exporter-ztj9w 1/1 Running 0 9s
ungaged-woodpecker-prometheus-pushgateway-57b67c7575-c868b 0/1 Running 0 9s
ungaged-woodpecker-prometheus-server-7f858db57-w5h2j 1/2 Running 0 9s
Check pod details:
$ kubectl describe po ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4
Name: ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4
Namespace: default
Node: ip-100.200.0.1.ap-northeast-1.compute.internal/100.200.0.1
Start Time: Fri, 26 Jan 2018 02:45:10 +0000
Labels: app=prometheus
component=alertmanager
pod-template-hash=2959465499
release=ungaged-woodpecker
Annotations: kubernetes.io/created-by={"kind":"SerializedReference","apiVersion":"v1","reference":{"kind":"ReplicaSet","namespace":"default","name":"ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff","uid":"ec...
kubernetes.io/limit-ranger=LimitRanger plugin set: cpu request for container prometheus-alertmanager; cpu request for container prometheus-alertmanager-configmap-reload
Status: Running
IP: 100.96.6.91
Created By: ReplicaSet/ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff
Controlled By: ReplicaSet/ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff
Containers:
prometheus-alertmanager:
Container ID: docker://e9fe9d7bd4f78354f2c072d426fa935d955e0d6748c4ab67ebdb84b51b32d720
Image: prom/alertmanager:v0.9.1
Image ID: docker-pullable://prom/alertmanager@sha256:ed926b227327eecfa61a9703702c9b16fc7fe95b69e22baa656d93cfbe098320
Port: 9093/TCP
Args:
--config.file=/etc/config/alertmanager.yml
--storage.path=/data
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Fri, 26 Jan 2018 02:45:26 +0000
Finished: Fri, 26 Jan 2018 02:45:26 +0000
Ready: False
Restart Count: 2
Requests:
cpu: 100m
Readiness: http-get http://:9093/%23/status delay=30s timeout=30s period=10s #success=1 #failure=3
Environment: <none>
Mounts:
/data from storage-volume (rw)
/etc/config from config-volume (rw)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-wppzm (ro)
prometheus-alertmanager-configmap-reload:
Container ID: docker://9320a0f157aeee7c3947027667aa6a2e00728d7156520c19daec7f59c1bf6534
Image: jimmidyson/configmap-reload:v0.1
Image ID: docker-pullable://jimmidyson/configmap-reload@sha256:2d40c2eaa6f435b2511d0cfc5f6c0a681eeb2eaa455a5d5ac25f88ce5139986e
Port: <none>
Args:
--volume-dir=/etc/config
--webhook-url=http://localhost:9093/-/reload
State: Running
Started: Fri, 26 Jan 2018 02:45:11 +0000
Ready: True
Restart Count: 0
Requests:
cpu: 100m
Environment: <none>
Mounts:
/etc/config from config-volume (ro)
/var/run/secrets/kubernetes.io/serviceaccount from default-token-wppzm (ro)
Conditions:
Type Status
Initialized True
Ready False
PodScheduled True
Volumes:
config-volume:
Type: ConfigMap (a volume populated by a ConfigMap)
Name: ungaged-woodpecker-prometheus-alertmanager
Optional: false
storage-volume:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
default-token-wppzm:
Type: Secret (a volume populated by a Secret)
SecretName: default-token-wppzm
Optional: false
QoS Class: Burstable
Node-Selectors: <none>
Tolerations: node.alpha.kubernetes.io/notReady:NoExecute for 300s
node.alpha.kubernetes.io/unreachable:NoExecute for 300s
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 34s default-scheduler Successfully assigned ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4 to ip-100.200.0.1.ap-northeast-1.compute.internal
Normal SuccessfulMountVolume 34s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal MountVolume.SetUp succeeded for volume "storage-volume"
Normal SuccessfulMountVolume 34s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal MountVolume.SetUp succeeded for volume "config-volume"
Normal SuccessfulMountVolume 34s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal MountVolume.SetUp succeeded for volume "default-token-wppzm"
Normal Pulled 33s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Container image "jimmidyson/configmap-reload:v0.1" already present on machine
Normal Created 33s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Created container
Normal Started 33s kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Started container
Normal Pulled 18s (x3 over 34s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Container image "prom/alertmanager:v0.9.1" already present on machine
Normal Created 18s (x3 over 34s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Created container
Normal Started 18s (x3 over 33s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Started container
Warning BackOff 2s (x4 over 32s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Back-off restarting failed container
Warning FailedSync 2s (x4 over 32s) kubelet, ip-100.200.0.1.ap-northeast-1.compute.internal Error syncing pod
I'm not sure why it shows FailedSync.
When you run kubectl port-forward with that command, it makes the port available on localhost of the machine where kubectl is running. So run the command and then open http://localhost:9090.
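A minimal sketch of that flow follows. The function name forward_port is hypothetical, the pod name in the example comes from the pod listing in the question, and user@ec2-host in the comment is a placeholder for your own SSH login.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# Forward a local port to the same port on a pod in the default namespace.
# $1 = pod name, $2 = port
forward_port() {
  "$KUBECTL" port-forward --namespace default "$1" "$2"
}

# Example usage (needs cluster access; the command blocks while forwarding):
#   forward_port ungaged-woodpecker-prometheus-server-7f858db57-w5h2j 9090
# Then, in another terminal on the same machine:
#   curl http://localhost:9090/
# If kubectl runs on the EC2 instance and the browser is on your laptop,
# an SSH tunnel is one way to bridge the two (placeholder login):
#   ssh -L 9090:localhost:9090 user@ec2-host
```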
You won't be able to reach the Prometheus ports directly from the public IP, outside the cluster. In the longer run you may want to expose Prometheus at a nice domain name via ingress (which the chart supports); that's how I'd do it. To use the chart's ingress support you will need to install an ingress controller in your cluster (the nginx ingress controller, for example), and then enable ingress by setting --set server.ingress.enabled=true and --set server.ingress.hosts[0]=prometheus.yourdomain.com. Ingress is a fairly large topic in itself, so I'll just refer you to the official docs for that one:
https://kubernetes.io/docs/concepts/services-networking/ingress/
And here's the nginx ingress controller:
https://github.com/kubernetes/ingress-nginx
As for the pod that is showing FailedSync, take a look at the logs using kubectl logs ungaged-woodpecker-prometheus-alertmanager-6f9f8b98ff-qhhw4 -c prometheus-alertmanager to see if there's any additional information there.
I'm trying to run through the Kubernetes guestbook example on AWS. I created the master and 4 nodes with the kube-up.sh script and am trying to get the frontend exposed via a load balancer.
Here are the pods
root@ip-172-20-0-9:~/kubernetes# kubectl get pods
NAME READY STATUS RESTARTS AGE
frontend-2q0at 1/1 Running 0 5m
frontend-5hmxq 1/1 Running 0 5m
frontend-s7i0r 1/1 Running 0 5m
redis-master-y6160 1/1 Running 0 53m
redis-slave-49gya 1/1 Running 0 24m
redis-slave-85u1r 1/1 Running 0 24m
Here are the services
root@ip-172-20-0-9:~/kubernetes# kubectl get services
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
kubernetes 10.0.0.1 <none> 443/TCP <none> 1h
redis-master 10.0.90.210 <none> 6379/TCP name=redis-master 37m
redis-slave 10.0.205.92 <none> 6379/TCP name=redis-slave 24m
I edited the YAML for the frontend service to try to add a load balancer, but it's not showing up:
root@ip-172-20-0-9:~/kubernetes# cat examples/guestbook/frontend-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: frontend
  labels:
    name: frontend
spec:
  # if your cluster supports it, uncomment the following to automatically create
  # an external load-balanced IP for the frontend service.
  type: LoadBalancer
  ports:
    # the port that this service should serve on
  - port: 80
  selector:
    name: frontend
Here are the commands I ran:
root@ip-172-20-0-9:~/kubernetes# kubectl create -f examples/guestbook/frontend-controller.yaml
replicationcontroller "frontend" created
root@ip-172-20-0-9:~/kubernetes# kubectl get services
NAME CLUSTER_IP EXTERNAL_IP PORT(S) SELECTOR AGE
kubernetes 10.0.0.1 <none> 443/TCP <none> 1h
redis-master 10.0.90.210 <none> 6379/TCP name=redis-master 39m
redis-slave 10.0.205.92 <none> 6379/TCP name=redis-slave 26m
If I remove the load balancer it loads up, but with no external IP.
It looks like the external IP might only be shown on Google's platform. On AWS it creates an ELB and doesn't show the external IP of the ELB.
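On AWS the load balancer is normally reported as a DNS hostname rather than an IP, so one way to find it is to read the hostname field from the service status. This is only a sketch: elb_hostname is a hypothetical helper name, and it assumes the frontend service exists and the ELB has finished provisioning.

```shell
#!/bin/sh
# KUBECTL can be overridden (for example KUBECTL=echo) to preview the command.
KUBECTL="${KUBECTL:-kubectl}"

# Print the DNS name of the load balancer created for a service.
# $1 = service name
elb_hostname() {
  "$KUBECTL" get service "$1" \
    -o jsonpath='{.status.loadBalancer.ingress[0].hostname}'
}

# Example usage (needs cluster access):
#   elb_hostname frontend
# The same value is listed under "LoadBalancer Ingress" in:
#   kubectl describe service frontend
```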