Replace specific values using Kustomize

I am evaluating Kustomize as a templating solution for my Project. I want an option to replace specific key-value pairs.
ports:
  - containerPort: 8081
resources:
  limits:
    cpu: $CPU_LIMIT
    memory: $MEMORY_LIMIT
  requests:
    cpu: $CPU_REQUESTS
    memory: $MEMORY_REQUESTS
In the above example, I want to replace CPU_LIMIT with a config-driven value. What options do I have to do this with Kustomize?

Kustomize doesn't do direct variable replacement the way a templating engine does, but there are some options depending on which attributes you need to parameterize.
For most attributes in Deployments, StatefulSets, DaemonSets, Pods, Jobs, and so on, you can use environment variables driven by a ConfigMap, so you don't necessarily need a variable at build time. However, this doesn't work for values like resource limits and requests, as those would be processed before ConfigMaps would be mounted (see the sketch below).
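As a minimal sketch of the distinction (the ConfigMap name app-config is a hypothetical placeholder), environment values can come from a ConfigMap at runtime, while the resources block has to hold literal values:

apiVersion: v1
kind: Pod
metadata:
  name: example
spec:
  containers:
    - name: app
      image: nginx
      envFrom:
        - configMapRef:
            name: app-config    # runtime values injected as environment variables
      resources:                # these must be literal values; a ConfigMap cannot fill them in
        limits:
          cpu: 200m
          memory: 256Mi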
Kustomize isn't designed to be a templating engine; it takes a purely declarative approach to configuration management. This includes the ability to use patches as overlays (overrides) and to reference shared resources so you can stay DRY (Don't Repeat Yourself), which is especially useful when your configuration powers multiple Kubernetes clusters.
With Kustomize, consider whether patching meets your needs. There are several different ways Kustomize can patch a file. If you only need to change individual attributes, you can use patchesJson6902 (a small sketch follows); but when you have to change a lot of values in a deployment, changing them one at a time this way is cumbersome, so use something like patchesStrategicMerge instead.
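As a minimal patchesJson6902 sketch (the Deployment name site and the file name cpu_patch.yaml are assumptions for illustration), the kustomization.yaml entry would look like:

patchesJson6902:
  - target:
      group: apps
      version: v1
      kind: Deployment
      name: site
    path: cpu_patch.yaml

with cpu_patch.yaml containing a single RFC 6902 operation:

- op: replace
  path: /spec/template/spec/containers/0/resources/limits/cpu
  value: 400m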
Consider the following way to use a patch (overlay):
.
├── base
│   └── main
│       ├── kustomization.yaml
│       └── resource.yaml
└── cluster
    ├── kustomization.yaml
    └── pod_overlay.yaml
Contents of base/main/resource.yaml:
---
apiVersion: v1
kind: Pod
metadata:
  name: site
  labels:
    app: web
spec:
  containers:
    - name: front-end
      image: nginx
      ports:
        - containerPort: 8081
      resources:
        requests:
          cpu: 100m
          memory: 4Gi
        limits:
          cpu: 200m
          memory: 8Gi
Contents of cluster/pod_overlay.yaml:
---
apiVersion: v1
kind: Pod
metadata:
  name: site
spec:
  containers:
    - name: front-end
      resources:
        requests:
          cpu: 200m
          memory: 8Gi
        limits:
          cpu: 400m
          memory: 16Gi
Note that we only included the selectors (kind, metadata.name, spec.containers[0].name) and the values we wanted to replace, in this case the resource requests and limits. You don't have to duplicate the entire resource for the patch to apply.
Now, to apply the patch with Kustomize, here are the contents of cluster/kustomization.yaml:
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
- ../base/main
patchesStrategicMerge:
- pod_overlay.yaml
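To see the patched output, build the overlay directory, for example:

kustomize build cluster
# or, using the version bundled with kubectl:
kubectl kustomize cluster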
Another option to consider if you really need templating power is to use Helm.
Helm is a much more robust templating engine that you may want to consider, and you can combine Helm for templating with Kustomize for resource management, patches for specific configuration, and overlays.
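As a rough sketch of what that templating looks like on the Helm side (the values keys below are assumptions, not part of any particular chart), the resources block from the question could be rendered from values.yaml:

# templates/deployment.yaml (excerpt)
resources:
  limits:
    cpu: {{ .Values.resources.limits.cpu }}
    memory: {{ .Values.resources.limits.memory }}
  requests:
    cpu: {{ .Values.resources.requests.cpu }}
    memory: {{ .Values.resources.requests.memory }}

# values.yaml
resources:
  limits:
    cpu: 200m
    memory: 8Gi
  requests:
    cpu: 100m
    memory: 4Gi

Chart templates also commonly render the whole block at once with {{- toYaml .Values.resources | nindent 12 }}.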

Related

Helm hook for post-install, post-upgrade using busybox wget is failing

I am trying to deploy a Helm post-install, post-upgrade hook which will create a simple pod with busybox and perform a wget on an app's application port to ensure the app is reachable.
I cannot get the hook to pass, even though I know the sample app is up and available.
Here is the manifest:
apiVersion: v1
kind: Pod
metadata:
  name: post-install-test
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "helm.sh/hook-delete-policy": before-hook-creation,hook-succeeded
spec:
  containers:
    - name: wget
      image: busybox
      imagePullPolicy: IfNotPresent
      command: ["/bin/sh","-c"]
      args: ["sleep 15; wget {{ include "sampleapp.fullname" . }}:{{ .Values.service.applicationPort.port }}"]
  restartPolicy: Never
As you can see in the args of the manifest, the hostname being contacted is written in Helm's template syntax. A developer will input the desired name of their app in a Jenkins pipeline, so I can't hardcode it.
I see from kubectl logs -n namespace post-install-test, this result:
Connecting to sample-app:8080 (172.20.87.74:8080)
wget: server returned error: HTTP/1.1 404 Not Found
But when I check the EKS resources, I see that the pod running the sample app I'm trying to test has an added suffix, which I've determined is the pod-template-hash.
sample-app-7fcbd52srj9
Is this suffix making my Helm hook fail? Is there a way I can account for this template hash?
I've tried different syntaxes for the command, but I can confirm from the kubectl logs that the Helm hook is attempting to connect and keeps getting a 404.
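For reference, one hedged way to check what those args actually render to (the release name, chart path, and template file name here are placeholders) is to render the chart locally:

helm template my-release ./sampleapp --show-only templates/post-install-test.yaml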

Flask application: Using gunicorn on ECS Fargate, how to

Info:
I created a Flask app, and the last command in my Dockerfile is CMD gunicorn -b 0.0.0.0:5000 --access-logfile - "app:create_app()"
I build, tag, and upload the image to ECR.
I used this Docker image to create an ECS Fargate service with the following configuration (posting only the parts relevant to the question):
ECSTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    Cpu: "256"
    Memory: "1024"
    RequiresCompatibilities:
      - FARGATE
    ContainerDefinitions:
      - Name: contained_above
        .
        .
        .
ECSService:
  Type: AWS::ECS::Service
  DependsOn: ListenerRule
  Properties:
    Cluster: !Sub "${EnvName}-ECScluster"
    DesiredCount: 1
    LaunchType: FARGATE
    DeploymentConfiguration:
      MaximumPercent: 200
      MinimumHealthyPercent: 50
    NetworkConfiguration:
      AwsvpcConfiguration:
        AssignPublicIp: ENABLED
        Subnets:
          - Fn::ImportValue: !Sub "${EnvName}-PUBLIC-SUBNET-1"
          - Fn::ImportValue: !Sub "${EnvName}-PUBLIC-SUBNET-2"
        SecurityGroups:
          - Fn::ImportValue: !Sub "${EnvName}-CONTAINER-SECURITY-GROUP"
    ServiceName: !Sub "${EnvName}-ECS-SERVICE"
    TaskDefinition: !Ref ECSTaskDefinition
    LoadBalancers:
      - ContainerName: contained_above
        ContainerPort: 5000
        TargetGroupArn: !Ref TargetGroup
(App is working normally)
Question
Now my question is: what should the number of workers be in the gunicorn command (the last command in my Dockerfile)?
The gunicorn design docs state: "Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with."
So what's the number of cores on Fargate? Does it actually make sense to combine gunicorn with Fargate like the above process? Is there 'compatibility' between load balancers and gunicorn workers? What is the connection between the DesiredCount of the ECS Service and the gunicorn -w workers value? Am I missing or misunderstanding something?
Possible solution(?)
One way that I could call it is the following:
CMD gunicorn -b 0.0.0.0:5000 -w $(( 2 * `cat /proc/cpuinfo | grep 'core id' | wc -l` + 1 )) --access-logfile - "app:create_app()"
But I am not sure if that would be a good solution.
Any insights? Thanks
EDIT: I'm using a configuration file for gunicorn to use when starting:
gunicorn.conf.py
import multiprocessing
bind = "0.0.0.0:8080"
workers = multiprocessing.cpu_count() * 2 + 1
worker_class = "uvicorn.workers.UvicornH11Worker"
keepalive = 0
You can tell gunicorn which config file to use with the --config flag.
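For example (app:create_app() is taken from the question's Dockerfile; the file name gunicorn.conf.py matches the config above):

gunicorn --config gunicorn.conf.py "app:create_app()"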
Sadly I can't find the source anymore, but I've read that 4-12 workers should be enough to handle hundreds if not thousands of simultaneous requests - depending on your application structure, worker class and payload size.
Do take this with a grain of salt though, since I can't find the source anymore, but it was in an accepted SO answer from a well-reputed person, if I remember correctly.
The official gunicorn docs state something in the 2-4 x $(NUM_CORES) range.
Another option would be, as the gunicorn docs state at another point:
Generally we recommend (2 x $num_cores) + 1 as the number of workers to start off with. While not overly scientific, the formula is based on the assumption that for a given core, one worker will be reading or writing from the socket while the other worker is processing a request.
Obviously, your particular hardware and application are going to affect the optimal number of workers. Our recommendation is to start with the above guess and tune using TTIN and TTOU signals while the application is under load.
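For reference, a quick sketch of what that signal-based tuning looks like in practice (pgrep -o picks the oldest matching process, which is normally the gunicorn master):

# add one worker to the running master process
kill -TTIN $(pgrep -o -f gunicorn)
# remove one worker
kill -TTOU $(pgrep -o -f gunicorn)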
So far I've been running well holding true to the 4-12 worker recommendation. My company runs several APIs, which connect to other APIs out there, which results in mostly 1-2 second request times, with the longest taking up to a whole minute (a lot of external API calls here).
Another colleague I talked to mentioned that they use 1 worker per 5 simultaneous requests they expect, with APIs similar to ours. Works fine for them as well.

Configuring Concourse CI to use AWS Secrets Manager

I have been trying to figure out how to configure the Docker version of Concourse (https://github.com/concourse/concourse-docker) to use AWS Secrets Manager. I added the following environment variables to the docker-compose file, but from the logs it doesn't look like it ever reaches out to AWS to fetch the creds. Am I missing something, or should this happen automatically when these variables are added under environment in the docker-compose file? Here are the docs I have been looking at: https://concourse-ci.org/aws-asm-credential-manager.html
version: '3'
services:
  concourse-db:
    image: postgres
    environment:
      POSTGRES_DB: concourse
      POSTGRES_PASSWORD: concourse_pass
      POSTGRES_USER: concourse_user
      PGDATA: /database
  concourse:
    image: concourse/concourse
    command: quickstart
    privileged: true
    depends_on: [concourse-db]
    ports: ["9090:8080"]
    environment:
      CONCOURSE_POSTGRES_HOST: concourse-db
      CONCOURSE_POSTGRES_USER: concourse_user
      CONCOURSE_POSTGRES_PASSWORD: concourse_pass
      CONCOURSE_POSTGRES_DATABASE: concourse
      CONCOURSE_EXTERNAL_URL: http://XXX.XXX.XXX.XXX:9090
      CONCOURSE_ADD_LOCAL_USER: test: test
      CONCOURSE_MAIN_TEAM_LOCAL_USER: test
      CONCOURSE_WORKER_BAGGAGECLAIM_DRIVER: overlay
      CONCOURSE_AWS_SECRETSMANAGER_REGION: us-east-1
      CONCOURSE_AWS_SECRETSMANAGER_ACCESS_KEY: <XXXX>
      CONCOURSE_AWS_SECRETSMANAGER_SECRET_KEY: <XXXX>
      CONCOURSE_AWS_SECRETSMANAGER_TEAM_SECRET_TEMPLATE: /concourse/{{.Secret}}
      CONCOURSE_AWS_SECRETSMANAGER_PIPELINE_SECRET_TEMPLATE: /concourse/{{.Secret}}
pipeline.yml example:
jobs:
  - name: build-ui
    plan:
      - get: web-ui
        trigger: true
      - get: resource-ui
      - task: build-task
        file: web-ui/ci/build/task.yml
      - put: resource-ui
        params:
          repository: updated-ui
          force: true
      - task: e2e-task
        file: web-ui/ci/e2e/task.yml
        params:
          UI_USERNAME: ((ui-username))
          UI_PASSWORD: ((ui-password))
resources:
  - name: cf
    type: cf-cli-resource
    source:
      api: https://api.run.pivotal.io
      username: ((cf-username))
      password: ((cf-password))
      org: Blah
  - name: web-ui
    type: git
    source:
      uri: git@github.com:blah/blah.git
      branch: master
      private_key: ((git-private-key))
When storing parameters for Concourse pipelines in AWS Secrets Manager, the name must follow this syntax:
/concourse/TEAM_NAME/PIPELINE_NAME/PARAMETER_NAME
If you have common parameters that are used across the team in multiple pipelines, use this syntax to avoid creating redundant parameters in Secrets Manager:
/concourse/TEAM_NAME/PARAMETER_NAME
The highest level that is supported is the Concourse team level.
Global parameters are not possible, so these variables in your compose environment will not be supported:
CONCOURSE_AWS_SECRETSMANAGER_TEAM_SECRET_TEMPLATE: /concourse/{{.Secret}}
CONCOURSE_AWS_SECRETSMANAGER_PIPELINE_SECRET_TEMPLATE: /concourse/{{.Secret}}
Unless you want to change the /concourse prefix, these parameters should be left at their defaults.
And when retrieving these parameters in the pipeline, no changes are required in the template. Just pass the PARAMETER_NAME; Concourse will handle the lookup in Secrets Manager based on the team and pipeline name.
...
params:
  UI_USERNAME: ((ui-username))
  UI_PASSWORD: ((ui-password))
...
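For example, a hedged sketch of creating one such secret with the AWS CLI (the team name main, the pipeline name, and the value are placeholders):

aws secretsmanager create-secret \
  --name /concourse/main/my-pipeline/ui-username \
  --secret-string "my-ui-user"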

Soft memory limit for AWS ECS task defined by docker-compose.yaml

Amazon provides the ecs-cli compose command, which can set up a task definition from docker-compose.yaml.
But I am not able to declare memory limits (especially the soft one) for such a task. The deploy option is not supported:
Skipping unsupported YAML option for service... option name=deploy
Is there a way to achieve this with compose? Or is using compose a bad idea, and is it better to use native task definitions?
update
My compose file was requested, here it is:
version: '3'
services:
  worker:
    image: 880289074637.dkr.ecr.us-east-1.amazonaws.com/negative-keywords:latest
    env_file: .env
    command: ["celery", "-A", "negmatch", "worker", "-l", "info"]
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 256M
        reservations:
          cpus: '0.25'
          memory: 128M
  web:
    image: 880289074637.dkr.ecr.us-east-1.amazonaws.com/negative-keywords:latest
    env_file: .env
    ports:
      - "80:8000"
    depends_on:
      - "worker"
    deploy:
      resources:
        limits:
          cpus: '0.50'
          memory: 256M
        reservations:
          cpus: '0.25'
          memory: 128M
You will need to use v2 of the docker compose file format to set these values.
As of today, according to the docker docs, deploy is meant for swarm mode deployments only:
Looking for options to set resources on non swarm mode containers? The options described here are specific to the deploy key and swarm mode. If you want to set resource constraints on non swarm deployments, use Compose file format version 2 CPU, memory, and other resource options. If you have further questions, refer to the discussion on the GitHub issue docker/compose/4513.
More info on using v2 vs v3.
https://github.com/docker/compose/issues/4513#issuecomment-377311337
Here is a sample docker-compose (v2) which sets the soft and hard memory limits on the container definition of the task: mem_limit is the hard limit and mem_reservation is the soft limit.
Command -
ecs-cli compose --project-name nginx --file docker-compose.yaml create
Compose File -
version: '2'
services:
  nginx:
    image: "nginx:latest"
    mem_limit: 512m
    mem_reservation: 128m
    cpu_shares: 0
    ports:
      - 80
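For reference, a hedged sketch of how those compose keys map onto the registered container definition, expressed in the CloudFormation shape used elsewhere on this page (values mirror the compose file above):

ContainerDefinitions:
  - Name: nginx
    Image: "nginx:latest"
    Memory: 512              # hard limit, from mem_limit
    MemoryReservation: 128   # soft limit, from mem_reservation
    Cpu: 0                   # from cpu_shares
    PortMappings:
      - ContainerPort: 80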

AWS EC2 issue with Docker Swarm using dnsrr to setup an ElasticSearch cluster discovery

First of all, the issue is not specific to ElasticSearch, I think (so as not to discourage some potential answers).
I'm using a docker service with dnsrr (DNS round-robin) to allow for the discovery of every node in the cluster: they always try the hostname 'elastic' and (should) get different IPs.
This works perfectly fine when I create 3 VMs on my local machine, but when I run it on 3 EC2 machines I can't figure out why the one configured as swarm leader only tries its own IP, while the two workers discover each other without issue.
I'm fairly new to AWS, so I guess it must be some kind of misconfiguration somewhere, but I can't figure out what to check.
Thanks in advance if you have any idea as to what may cause this, and even better if you come up with a solution!
The docker compose file in use is below, simplified as much as possible to isolate the problem.
version: "3.3"
services:
elastic:
image: docker.elastic.co/elasticsearch/elasticsearch:5.5.2
environment:
- ES_JAVA_OPTS=-Xms1g -Xmx1g
- discovery.zen.ping.unicast.hosts=elastic
- discovery.zen.minimum_master_nodes=2
volumes:
- elastic_data:/usr/share/elasticsearch/data
networks:
- overnet
logging:
driver: "json-file"
options:
max-size: "20m"
max-file: "10"
deploy:
mode: global
endpoint_mode: dnsrr
networks:
overnet:
driver: overlay
driver_opts:
encrypted: "true"
volumes:
elastic_data:
external: true
Try re-creating without encryption enabled to see if that works.
Also, be sure you have a security group between the three nodes with all the proper ports open between them (a sketch of the required ports follows).
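For reference, a hedged sketch of the ports Docker Swarm overlay networking needs open between the nodes (the security group ID is a placeholder; it assumes all three instances share the same group):

SG=sg-0123456789abcdef0    # placeholder: security group shared by the three nodes

# cluster management traffic (manager node)
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol tcp --port 2377 --source-group "$SG"
# node-to-node gossip / service discovery
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol tcp --port 7946 --source-group "$SG"
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol udp --port 7946 --source-group "$SG"
# VXLAN overlay network traffic
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol udp --port 4789 --source-group "$SG"
# ESP (IP protocol 50), needed because the overlay is encrypted
aws ec2 authorize-security-group-ingress --group-id "$SG" --protocol 50 --source-group "$SG"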