How to add insecure Docker registry certificate to kubeadm config

I'm quite new to Kubernetes, and I managed to get an Angular app deployed locally using minikube. But now I'm working on a Bitnami Kubernetes Sandbox EC2 instance, and I've run into issues pulling from my docker registry on another EC2 instance.
Whenever I attempt to apply the deployment, the pods log the following error
Failed to pull image "registry-url.net:5000/app": no available registry endpoint:
failed to do request: Head https://registry-url.net/v2/app/manifests/latest:
x509: certificate signed by unknown authority
The docker registry certificate is signed by a CA (Comodo RSA), but I had to add the registry's .crt and .key files to /etc/docker/certs.d/registry-url.net:5000/ for my local copy of minikube and docker.
However, the Bitnami instance doesn't have an /etc/docker/ directory and there is no daemon.json file to add insecure registry exceptions, and I'm not sure where the cert files are meant to be located for kubeadm.
So is there a similar location to place .crt and .key files for kubeadm, or is there a command I can run to add my docker registry to a list of exceptions?
Or better yet, is there a way to get Kubernetes/docker to recognize the CA of the registry's SSL certs?
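For reference, this is roughly what my working local setup looks like, and the kind of daemon.json exception I was expecting to find on the Bitnami instance (paths and values are just an illustration of my own setup):
# Per-registry certificates my local docker/minikube picks up:
ls /etc/docker/certs.d/registry-url.net:5000/
# registry.crt  registry.key

# The insecure-registry exception I'd normally put in /etc/docker/daemon.json:
cat /etc/docker/daemon.json
# {"insecure-registries": ["registry-url.net:5000"]}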
Thanks
Edit: I've included my deployment and secret files below:
app-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: app-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: app
        image: registry-url.net:5000/app
        ports:
        - containerPort: 80
        env:
          ...
      imagePullSecrets:
      - name: registry-pull-secret
registry-secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: registry-pull-secret
data:
  .dockerconfigjson: <base-64 JSON>
type: kubernetes.io/dockerconfigjson

You need to create a secret with the details for the repository.
Here is an example of pushing an image to your docker repo:
docker login my-registry-url:5000
Username (admin):
Password:
Login Succeeded
docker tag user/my-cool-image my-registry-url:5000/my-cool-image:0.1
docker push my-registry-url:5000/my-cool-image:0.1
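For reference, after the docker login above, ~/.docker/config.json typically ends up with an auths entry for the registry (the exact contents depend on whether a credential helper is configured), roughly:
cat ~/.docker/config.json
# {
#   "auths": {
#     "my-registry-url:5000": {
#       "auth": "<base64 of username:password>"
#     }
#   }
# }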
From that host, create the base64 encoding of ~/.docker/config.json like so (with GNU base64, -w 0 keeps the output on a single line so it can be pasted into the secret):
cat ~/.docker/config.json | base64 -w 0
Then you will be able to add it to the secret, so create a yaml that might look like the following:
apiVersion: v1
kind: Secret
metadata:
  name: registrypullsecret
data:
  .dockerconfigjson: <base-64-encoded-json-here>
type: kubernetes.io/dockerconfigjson
Once done you can apply the secret using kubectl create -f my-secret.yaml && kubectl get secrets.
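Alternatively, kubectl can generate an equivalent dockerconfigjson secret for you without hand-encoding the JSON; a sketch (substitute your own registry credentials):
kubectl create secret docker-registry registrypullsecret \
  --docker-server=my-registry-url:5000 \
  --docker-username=admin \
  --docker-password=<password> \
  --docker-email=<email>
kubectl get secret registrypullsecret -o yaml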
As for your pod it should look like this:
apiVersion: v1
kind: Pod
metadata:
  name: jss
spec:
  imagePullSecrets:
  - name: registrypullsecret
  containers:
  - name: jss
    image: my-registry-url:5000/my-cool-image:0.1
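If the image still fails to pull, the pod's events usually tell you whether it is the pull secret or the registry's TLS that is being rejected; for example (assuming the Pod manifest above is saved as my-pod.yaml):
kubectl create -f my-pod.yaml
kubectl describe pod jss        # check the Events section for ErrImagePull / x509 messages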

So I ended up solving my issue by manually installing docker via the following commands:
sudo add-apt-repository \
"deb [arch=amd64] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) \
stable"
sudo apt-get install docker-ce docker-ce-cli containerd.io
Then I had to create the directory structure /etc/docker/certs.d/registry-url:5000/ and copy the registry's .crt and .key files into the directory.
However, this still didn't work; but after stopping the EC2 instance and starting it again, it pulled from the remote registry with no issues.
When I initially ran service kubelet restart, the changes didn't seem to take effect, but restarting the instance did the trick. I'm not sure if there's a better way of fixing my issue, but this was the only solution that worked for me.
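For anyone hitting the same thing, the steps boiled down to roughly the following (registry-url:5000 stands in for your registry host and port; the file names are whatever your cert files are called):
# Create the per-registry cert directory dockerd looks in and copy the files over
sudo mkdir -p /etc/docker/certs.d/registry-url:5000
sudo cp registry.crt registry.key /etc/docker/certs.d/registry-url:5000/
# Restart the runtime and kubelet; in my case only a full reboot of the instance made it stick
sudo systemctl restart docker
sudo systemctl restart kubelet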

Related

Google managed Cloud run container fails to start on deploy from CLI but the same image works when manually deploying via dashboard

So I have this issue: I have a (currently local-only) devops process, which is just a series of bash commands that build a docker container for a nodejs application, upload it to Google Container Registry, and then deploy it to Google Cloud Run from there.
The issue I'm having is that the deployment step always fails with:
ERROR: (gcloud.beta.run.services.replace) Cloud Run error: Container failed to start. Failed to start and then listen on the port defined by the PORT environment variable.
There's nothing in the logs when I follow the link or manually check the logs for that service in Cloud Run.
At some point I had a code issue which was preventing the container from starting and I could see that error in the cloud run logs.
I'm using the following command & yaml to deploy:
gcloud beta run services replace .gcp/cloud_run/auth.yaml
and my yaml file:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: auth-service
spec:
  template:
    spec:
      containers:
      - image: gcr.io/my_project_id/auth-service
      serviceAccountName: abc@my_project_id.iam.gserviceaccount.com
EDIT:
I have since pulled the yaml file configuration for the service that I manually deployed, and it looks something like this:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  annotations:
    client.knative.dev/user-image: gcr.io/my_project_id/auth-service
    run.googleapis.com/ingress: all
    run.googleapis.com/ingress-status: all
    run.googleapis.com/launch-stage: BETA
  labels:
    cloud.googleapis.com/location: europe-west2
  name: auth-service
  namespace: "1032997338375"
spec:
  template:
    metadata:
      annotations:
        autoscaling.knative.dev/maxScale: "2"
        run.googleapis.com/client-name: cloud-console
        run.googleapis.com/sandbox: gvisor
      name: auth-service-00002-nux
    spec:
      containerConcurrency: 80
      containers:
      - image: gcr.io/my_project_id/auth-service
        ports:
        - containerPort: 3000
        resources:
          limits:
            cpu: 1000m
            memory: 512Mi
      serviceAccountName: abc@my_project_id.iam.gserviceaccount.com
      timeoutSeconds: 300
  traffic:
  - latestRevision: true
    percent: 100
I changed the name to that of the service I'm trying to deploy from the command line and deployed it as a new service just like before, and it worked right away without further modifications.
However, I'm not sure which of the configurations I was missing in my initial file, as the documentation on the YAML for Cloud Run deployments doesn't specify a minimum configuration.
Any ideas which configs I can keep & which can be filtered out?
If you compare the two yaml files, you'll see the containerPort property in the file generated by the console.
By default Cloud Run performs a health check and expects the container to listen on port 8080, or on the port that Docker/Cloud Run passes to the container via the PORT environment variable.
In your case the container listens on port 3000; if you don't declare that port, Cloud Run can't serve your image because it detects nothing listening on 8080.
You can define the yaml like this:
apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: auth-service
spec:
  template:
    spec:
      containers:
      - image: gcr.io/myproject/myimage:latest
        ports:
        - containerPort: 3000
      serviceAccountName: abc@my_project_id.iam.gserviceaccount.com
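Depending on your gcloud version, the port can also be set from the CLI instead of keeping it in the YAML; a sketch using the service and image names from the question:
# --port tells Cloud Run which container port to route requests to (instead of the default 8080)
gcloud run deploy auth-service \
  --image gcr.io/my_project_id/auth-service \
  --port 3000 \
  --region europe-west2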

pod-identity-webhook missing after EKS 1.16 upgrade

After upgrading to EKS 1.16, IAM Roles for Service Accounts stopped working.
It was configured as described in the article Configuring and assigning service accounts to pods, and worked with EKS 1.14 and 1.15.
Running service-account.yaml and test-pod.yaml on EKS 1.15 (qa env) does mount the following env variables
AWS_ROLE_ARN=arn:aws:iam::xxx:role/oidc-my-service-api-qa
AWS_WEB_IDENTITY_TOKEN_FILE=/var/run/secrets/eks.amazonaws.com/serviceaccount/token
While running the same resources on EKS 1.16 (test env), they are not added.
service-account.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::xxxxxxxxx:role/oidc-my-service-test
  name: oidc-my-service-service-account
test-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: test
spec:
  containers:
  - name: test
    image: busybox
    command: ["/bin/sh", "-c", "env | grep AWS"]
  securityContext:
    fsGroup: 1000
  serviceAccountName: "oidc-my-service-service-account"
UPDATE
Turns out I'm missing Amazon EKS Pod Identity Webhook, but where did it go?
EKS 1.15
kubectl get mutatingwebhookconfigurations pod-identity-webhook
NAME CREATED AT
pod-identity-webhook 2020-01-11T17:01:52Z
EKS 1.16
kubectl get mutatingwebhookconfigurations pod-identity-webhook
Error from server (NotFound): mutatingwebhookconfigurations.admissionregistration.k8s.io "pod-identity-webhook" not found
I was able to re-install amazon-eks-pod-identity-webhook in my EKS 1.16 cluster by cloning the code from GitHub, building a docker image, pushing it to my ECR repo, and running
make cluster-up IMAGE=my-account-id.dkr.ecr.my-region.amazonaws.com/pod-identity-webhook:latest
It's still an open question where it went, since it's part of the managed service, as stated in this comment.

Unable to configure kubernetes URL with kubernetes-Jenkins plugin

I'm new to kubernetes and am trying out the Jenkins kubernetes plugin. I have created a K8s cluster and a namespace called jenkins-pl in AWS. Below are my Jenkins deployment and service yaml files:
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: jenkins
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: jenkins
    spec:
      containers:
      - name: jenkins
        image: contactsai123/my-jenkins-image:1.0
        env:
        - name: JAVA_OPTS
          value: -Djenkins.install.runSetupWizard=false
        ports:
        - name: http-port
          containerPort: 8080
        - name: jnlp-port
          containerPort: 50000
        volumeMounts:
        - name: jenkins-home
          mountPath: /var/jenkins_home
      volumes:
      - name: jenkins-home
        emptyDir: {}
Here is my jenkins-service.yaml file
apiVersion: v1
kind: Service
metadata:
  name: jenkins
spec:
  type: LoadBalancer
  ports:
  - port: 8080
    targetPort: 8080
  selector:
    app: jenkins
I'm able to launch Jenkins successfully, but I'm unsure what I should provide for the kubernetes URL.
I gave "https://kubernetes.default.svc.cluster.local" and got the error message:
Error testing connection https://kubernetes.default.svc.cluster.local: Failure executing: GET at: https://kubernetes.default.svc.cluster.local/api/v1/namespaces/jenkins-pl/pods. Message: Forbidden!Configured service account doesn't have access. Service account may have been revoked. pods is forbidden: User "system:serviceaccount:jenkins-pl:default" cannot list pods in the namespace "jenkins-pl".
I executed the command:
$ kubectl cluster-info | grep master
and got the following output:
https://api-selegrid-k8s-loca-m23tbb-1891259367.us-west-2.elb.amazonaws.com
I provided the above as the Kubernetes URL, and I get a similar error as before.
Not sure how to move forward?
Your cluster has RBAC enabled. You have to give your deployment the necessary RBAC permissions to list pods.
Consider your deployment as a user who needs to perform some tasks in your cluster, so you have to grant it the necessary permissions.
First, you have to create a role. It can be a ClusterRole or a Role; the role defines what can be done under it. A ClusterRole grants permissions at cluster scope, whereas a Role grants permissions only in a particular namespace.
Then, you have to create a Service Account. Think of a service account as a user for an application instead of a person.
Finally, you bind the Role or ClusterRole to the service account through a RoleBinding or ClusterRoleBinding. This specifies which user/service account receives the permissions defined by which role.
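A minimal sketch with kubectl, assuming you keep using the default service account in the jenkins-pl namespace that the error message mentions (the role and binding names are illustrative):
# Role scoped to the jenkins-pl namespace that allows listing/watching pods
kubectl create role jenkins-pod-manager \
  --verb=get,list,watch \
  --resource=pods \
  -n jenkins-pl
# Bind that role to the service account the Jenkins plugin authenticates as
kubectl create rolebinding jenkins-pod-manager-binding \
  --role=jenkins-pod-manager \
  --serviceaccount=jenkins-pl:default \
  -n jenkins-pl
In practice the Jenkins kubernetes plugin also needs to create and delete agent pods and to access pods/exec and pods/log, so you will likely have to extend the verbs and resources beyond this minimal example.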
Check this nice post to understand RBAC: Configuring permissions in Kubernetes with RBAC
Also this video might help you to understand the basics: Role Based Access Control (RBAC) with Kubernetes

How to expose my pod to the internet and get to it from the browser?

First of all, I downloaded kubernetes and kubectl and created a cluster on aws (export KUBERNETES_PROVIDER=aws; wget -q -O - https://get.k8s.io | bash).
I added some lines to my project circle.yml to use circleCI services to build my image.
To support docker I added:
machine:
  services:
    - docker
and to create my image and push it to my artifactory I added:
deployment:
  commands:
    - docker login -e admin@comp.com -u ${ARTUSER} -p ${ARTKEY} docker-docker-local.someartifactory.com
    - sbt -DBUILD_NUMBER="${CIRCLE_BUILD_NUM}" docker:publish
After that I created 2 folders:
my project (MyApp) folder with two files:
controller.yaml
apiVersion: v1
kind: ReplicationController
metadata:
  name: MyApp
  labels:
    name: MyApp
spec:
  replicas: 1
  selector:
    name: MyApp
  template:
    metadata:
      labels:
        name: MyApp
        version: 0.1.4
    spec:
      containers:
      - name: MyApp
        #this is the image artifactory
        image: docker-docker-release.someartifactory.com/MyApp:0.1.4
        ports:
        - containerPort: 9000
      imagePullSecrets:
      - name: myCompany-artifactory
service.yaml
apiVersion: v1
kind: Service
metadata:
  name: MyApp
  labels:
    name: MyApp
spec:
  # if your cluster supports it, uncomment the following to automatically create
  # an external load-balanced IP for the frontend service.
  type: LoadBalancer
  ports:
  # the port that this service should serve on
  - port: 9000
  selector:
    name: MyApp
And I have another folder for my artifactory pull secret (kind: Secret).
Now I created my pods with:
kubectl create -f controller.yaml
And now I have my pod running when I check in kubectl get pods.
Now, how do I access my pod from the browser? My project is a Play project, so I want to get to it from the browser... how do I expose it in the simplest way?
thanks
The Replication Controller's sole responsibility is ensuring that the specified number of pods with the given configuration is running on your cluster.
The Service is what exposes your pods publicly (or internally) to other parts of the system (or the internet).
You should create your service from your yaml file (kubectl create -f service.yaml), which will create the service, selecting pods by the label selector MyApp and handling the load on the port given in your file (9000).
Afterwards, look at the registered service with kubectl get service to see which endpoint (IP) is allocated for it.
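Concretely, something along these lines should give you a browser-reachable address once AWS has provisioned the load balancer (which can take a minute or two):
kubectl create -f service.yaml
kubectl get service MyApp
# Look at the EXTERNAL-IP / LoadBalancer Ingress column, e.g. xxxx.us-west-2.elb.amazonaws.com,
# then open http://<that-hostname>:9000 in the browser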

Why doesn't my pod respond to requests on the exposed port?

I've just launched a fairly basic cluster based on the CoreOS kube-aws scripts.
https://coreos.com/kubernetes/docs/latest/kubernetes-on-aws.html
I've activated the registry add-on, and I have it correctly proxying to my local box so I can push images to the cluster on localhost:5000. I also have the proxy pod correctly loaded on each node so that localhost:5000 will also pull images from that registry.
https://github.com/kubernetes/kubernetes/tree/master/cluster/addons/registry
Then I dockerized a fairly simple Sinatra app to run on my cluster and pushed it to the registry. I also prepared a ReplicationController definition and Service definition to run the app. The images pulled and started no problem, I can use kubectl to get the startup logs from each pod that belongs to the replication group.
My problem is that when I curl the public ELB endpoint for my service, it just hangs.
Things I've tried:
I got the public IP for one of the nodes running my pod and attempted to curl it at the NodePort described in the service description, same thing.
I SSH'd into that node and attempted curl localhost:3000, same result.
Also SSH'd into that node, I attempted to curl <pod-ip>:3000, same result.
ps shows the Puma process running and listening on port 3000.
docker ps on the node shows that the app container is not forwarding any ports to the host. Is that maybe the problem?
The requests must be routing correctly because hitting those IPs at any other port results in a connection refused rather than hanging.
The Dockerfile for my app is fairly straightforward:
FROM ruby:2.2.4-onbuild
RUN apt-get update -qq && apt-get install -y \
libpq-dev \
postgresql-client
RUN mkdir -p /app
WORKDIR /app
COPY . /app
EXPOSE 3000
ENTRYPOINT ["ruby", "/app/bin/entrypoint.rb"]
Where entrypoint.rb will start a Puma server listening on port 3000.
My replication group is defined like so:
apiVersion: v1
kind: ReplicationController
metadata:
  name: web-controller
  namespace: app
spec:
  replicas: 2
  selector:
    app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      volumes:
      - name: secrets
        secret:
          secretName: secrets
      containers:
      - name: app
        image: localhost:5000/app:v2
        resources:
          limits:
            cpu: 100m
            memory: 50Mi
        env:
        - name: DATABASE_NAME
          value: app_production
        - name: DATABASE_URL
          value: postgresql://some.postgres.aws.com:5432
        - name: ENV
          value: production
        - name: REDIS_URL
          value: redis://some.redis.aws.com:6379
        volumeMounts:
        - name: secrets
          mountPath: "/etc/secrets"
          readOnly: true
        command: ['/app/bin/entrypoint.rb', 'web']
        ports:
        - containerPort: 3000
And here is my service:
apiVersion: v1
kind: Service
metadata:
  name: web-service
spec:
  ports:
  - port: 80
    targetPort: 3000
    protocol: TCP
  selector:
    app: web
  type: LoadBalancer
Output of kubectl describe service web-service:
Name: web-service
Namespace: app
Labels: <none>
Selector: app=web
Type: LoadBalancer
IP: 10.3.0.204
LoadBalancer Ingress: some.elb.aws.com
Port: <unnamed> 80/TCP
NodePort: <unnamed> 32062/TCP
Endpoints: 10.2.47.3:3000,10.2.73.3:3000
Session Affinity: None
No events.
docker ps on one of the nodes shows that the app container is not forwarding any ports to the host. Is that maybe the problem?
Edit to add entrypoint.rb and Procfile
entrypoint.rb:
#!/usr/bin/env ruby
db_user_file = '/etc/secrets/database_user'
db_password_file = '/etc/secrets/database_password'
ENV['DATABASE_USER'] = File.read(db_user_file) if File.exists?(db_user_file)
ENV['DATABASE_PASSWORD'] = File.read(db_password_file) if File.exists?(db_password_file)
exec("bundle exec foreman start #{ARGV[0]}")
Procfile:
web: PORT=3000 bundle exec puma
message_worker: bundle exec sidekiq -q messages -c 1 -r ./config/environment.rb
email_worker: bundle exec sidekiq -q emails -c 1 -r
There was nothing wrong with my Kubernetes set up. It turns out that the app was failing to start because the connection to the DB was timing out due to some unrelated networking issue.
For anyone curious: don't launch anything external to Kubernetes in the 10.x.x.x IP range (e.g. RDS, Elasticache, etc). Long story short, Kubernetes currently has an IPTables masquerade rule hardcoded that messes up communication with anything in that range that isn't part of the cluster. See the details here.
What I ended up doing was creating a separate VPC for my data stores on a different IP range and peering it with my Kubernetes VPC.
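If you want to check whether your own nodes are affected, the masquerade rules are visible with iptables (an inspection sketch, not a fix):
# List the NAT POSTROUTING rules on a node and look for a broad MASQUERADE
# covering destinations outside the cluster (e.g. a 10.0.0.0/8 style match)
sudo iptables -t nat -S POSTROUTING | grep -i masq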