Our server runs on Kubernetes for auto-scaling, and we use New Relic for observability, but we face some issues:
1. We need pods to restart when memory usage reaches 1G; currently a pod only restarts automatically when it reaches 1.2G, and by then everything runs slowly.
2. We want pods to be terminated when there are no requests to the server.
My configuration:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: {{ .Release.Name }}
  labels:
    app: {{ .Release.Name }}
spec:
  revisionHistoryLimit: 2
  replicas: {{ .Values.replicas }}
  selector:
    matchLabels:
      app: {{ .Release.Name }}
  template:
    metadata:
      labels:
        app: {{ .Release.Name }}
    spec:
      containers:
        - name: {{ .Release.Name }}
          image: "{{ .Values.imageRepository }}:{{ .Values.tag }}"
          env:
            {{- include "api.env" . | nindent 12 }}
          resources:
            limits:
              memory: {{ .Values.memoryLimit }}
              cpu: {{ .Values.cpuLimit }}
            requests:
              memory: {{ .Values.memoryRequest }}
              cpu: {{ .Values.cpuRequest }}
      imagePullSecrets:
        - name: {{ .Values.imagePullSecret }}
      {{- if .Values.tolerations }}
      tolerations:
{{ toYaml .Values.tolerations | indent 8 }}
      {{- end }}
      {{- if .Values.nodeSelector }}
      nodeSelector:
{{ toYaml .Values.nodeSelector | indent 8 }}
      {{- end }}
My values file:
memoryLimit: "2Gi"
cpuLimit: "1.0"
memoryRequest: "1.0Gi"
cpuRequest: "0.75"
That's what I am trying to achieve.
If you want to be sure your pod/deployment won't consume more than 1.0Gi of memory, then setting the memory limit to that value will do the job just fine.
Once you set that limit and your container exceeds it, it becomes a potential candidate for termination. If it continues to consume memory beyond its limit, the container will be terminated. If a terminated container can be restarted, the kubelet restarts it, as with any other type of runtime container failure.
For further reading, please visit the section Exceeding a container's memory limit.
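For example, with the values file from the question, that would mean lowering memoryLimit (a sketch of the change; the remaining values are kept as posted):
memoryLimit: "1.0Gi"   # the container is OOM-killed and restarted once it exceeds this
cpuLimit: "1.0"
memoryRequest: "1.0Gi"
cpuRequest: "0.75"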
Moving on, if you wish to scale your deployment based on requests, you will need custom metrics provided by an external adapter such as Prometheus. The Horizontal Pod Autoscaler natively supports scaling only on CPU and memory (based on metrics from the metrics server).
The adapter documentation walks you through configuring it with the Kubernetes API and the HPA. A list of other adapters can be found here.
You can then scale your deployment based on the http_requests metric as shown here, or on requests-per-second as described here.
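As a minimal sketch of that last step, an HPA scaling on a pods metric could look like the following (this assumes the Prometheus adapter is installed and exposes a metric named http_requests; the deployment name and thresholds are illustrative, not from the original setup):
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: api-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: api                  # hypothetical deployment name
  minReplicas: 1
  maxReplicas: 10
  metrics:
    - type: Pods
      pods:
        metric:
          name: http_requests
        target:
          type: AverageValue
          averageValue: 500m   # target roughly 0.5 requests per second per pod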
I'm currently running AWS EFS CSI driver v1.37 on EKS v1.20. The idea is to deploy a StatefulSet application whose volumes persist after undeploying and can then be reattached for subsequent deployments.
The initial process considered can be seen here: Kube AWS EFS CSI Driver. However, the volumes do not reattach.
AWS Support have indicated that perhaps the best approach would be to use static provisioning, creating the EFS Access Points up front and assigning them via the persistent volume templates, similar to:
{{- $name := include "fullname" . -}}
{{- $labels := include "labels" . -}}
{{- range $k, $v := .Values.persistentVolume }}
{{- if $v.enabled }}
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: {{ $v.metadata.name }}-{{ $name }}
  labels:
    name: "{{ $v.metadata.name }}-{{ $name }}"
    {{- $labels | nindent 4 }}
spec:
  capacity:
    storage: {{ $v.spec.capacity.storage | quote }}
  volumeMode: Filesystem
  accessModes:
    {{- toYaml $v.spec.accessModes | nindent 4 }}
  persistentVolumeReclaimPolicy: {{ $v.spec.persistentVolumeReclaimPolicy }}
  storageClassName: {{ $v.spec.storageClassName }}
  csi:
    driver: efs.csi.aws.com
    volumeHandle: {{ $v.spec.csi.volumeHandle }}
    volumeAttributes:
      encryptInTransit: "true"
{{- end }}
{{- end }}
The key var to note above is:
{{ $v.spec.csi.volumeHandle }}
whereby the EFS ID and AP ID are combined.
Has anyone tried this, or something similar, in order to establish persistent data volumes that can be reattached?
The answer is yes.
When running a StatefulSet, the trick is to swap out the volume claim template for a persistent volume claim.
The subpath is based on the pod name inside the volume mounts:
- name: data
  mountPath: /var/rabbitmq
  subPath: $(MY_POD_NAME)
And in turn mount the persistent volume claims inside the volumes:
- name: data
  persistentVolumeClaim:
    claimName: data-rabbitmq
The persistent volume claim is then tied back to the persistent volume by setting this inside the persistent volume claim:
volumeName: <pv-name>
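A minimal sketch of such a claim (the PV name, access mode and size are illustrative, not from the original setup):
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-rabbitmq
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""            # empty storage class, see below
  volumeName: data-rabbitmq-pv    # hypothetical name of the pre-created PV
  resources:
    requests:
      storage: 5Gi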
Both the persistent volume and the persistent volume claim have their storage class set like so:
storageClassName: "\"\""
The persistent volume sets both the EFS ID and EFS AP ID like so:
volumeHandle: fs-123::fsap-456
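A matching sketch of the persistent volume side (again with illustrative names and size):
apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-rabbitmq-pv             # hypothetical, referenced by volumeName above
spec:
  capacity:
    storage: 5Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  storageClassName: ""
  csi:
    driver: efs.csi.aws.com
    volumeHandle: fs-123::fsap-456   # <EFS ID>::<EFS Access Point ID>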
NB: the EFS AP is created up front via Terraform, not via the AWS EFS CSI driver.
And if sharing a single EFS cluster across multiple EKS clusters, the remaining piece of magic is to ensure that the base path inside the storage class is unique for all volumes, across all applications. This is set inside the storage class like so:
basePath: "/green_infra/queuing/rabbitmq_data"
Happy DevOps :~)
I have an NFS Helm chart. It is one of the charts for an application that has 5 more sub-charts. Two of the charts share storage, for which I am using NFS. On GCP, when I provide the NFS service name in the PV, it works.
apiVersion: v1
kind: PersistentVolume
metadata:
  name: {{ include "nfs.name" . }}
spec:
  capacity:
    storage: {{ .Values.persistence.nfsVolumes.size }}
  accessModes:
    - {{ .Values.persistence.nfsVolumes.accessModes }}
  mountOptions:
    - nfsvers=4.1
  nfs:
    server: nfs.default.svc.cluster.local  # nfs is the service name from {{ include "nfs.name" . }}
    path: "/opt/shared-shibboleth-idp"
But the same doesn't work on AWS EKS. The error there is a connection timeout, so it can't mount the volume.
When I change the server to
server: a4eab2d4aef2311e9a2880227e884517-1524131093.us-west-2.elb.amazonaws.com
I get a connection timeout.
All the mounts are okay since it works well with GCP.
What am I doing wrong?
I have a small application built in Django. It serves as a frontend and is being installed in one of our K8s clusters.
I'm using Helm to deploy the charts, and I fail to serve Django's static files correctly.
I've searched in multiple places, but I was unable to find anything that fixes my problem.
This is my ingress file:
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: orion-toolbelt
  namespace: {{ .Values.global.namespace }}
  annotations:
    # ingress.kubernetes.io/secure-backends: "false"
    # nginx.ingress.kubernetes.io/secure-backends: "false"
    ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/rewrite-target: /
    ingress.kubernetes.io/force-ssl-redirect: "false"
    nginx.ingress.kubernetes.io/force-ssl-redirect: "false"
    ingress.kubernetes.io/ssl-redirect: "false"
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
    ingress.kubernetes.io/ingress.allow-http: "true"
    nginx.ingress.kubernetes.io/ingress.allow-http: "true"
    nginx.ingress.kubernetes.io/proxy-body-size: 500m
spec:
  rules:
    - http:
        paths:
          - path: /orion-toolbelt
            backend:
              serviceName: orion-toolbelt
              servicePort: {{ .Values.service.port }}
The static file location in Django is kept at the default, e.g.
STATIC_URL = "/static"
The user ends up being unable to access the static files that way.
What should I do next?
Attached is the error:
HTML-static_files-error
-- EDIT: 5/8/19 --
The pod's deployment.yaml looks like the following:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  namespace: {{ .Values.global.namespace }}
  name: orion-toolbelt
  labels:
    app: orion-toolbelt
spec:
  replicas: 1
  selector:
    matchLabels:
      app: orion-toolbelt
  template:
    metadata:
      labels:
        app: orion-toolbelt
    spec:
      containers:
        - name: orion-toolbelt
          image: {{ .Values.global.repository.imagerepo }}/orion-toolbelt:10.4-SNAPSHOT-15
          ports:
            - containerPort: {{ .Values.service.port }}
          env:
            - name: "USERNAME"
              valueFrom:
                secretKeyRef:
                  key: username
                  name: {{ .Values.global.secretname }}
            - name: "PASSWORD"
              valueFrom:
                secretKeyRef:
                  key: password
                  name: {{ .Values.global.secretname }}
            - name: "MASTER_IP"
              valueFrom:
                secretKeyRef:
                  key: master_ip
                  name: {{ .Values.global.secretname }}
          imagePullPolicy: {{ .Values.global.pullPolicy }}
      imagePullSecrets:
        - name: {{ .Values.global.secretname }}
EDIT2: 20/8/19 - adding service.yaml
apiVersion: v1
kind: Service
metadata:
  namespace: {{ .Values.global.namespace }}
  name: orion-toolbelt
spec:
  selector:
    app: orion-toolbelt
  ports:
    - protocol: TCP
      port: {{ .Values.service.port }}
      targetPort: {{ .Values.service.port }}
You should simply keep the /static directory within the container and adjust the path to it in the application.
Otherwise, if it must be /static, if you don't want to keep the static files in the container, or if you want other containers to access the volume, you should think about mounting a Kubernetes volume into your Deployment/StatefulSet.
Edit:
You can test whether this path exists in your Kubernetes pod this way:
kubectl get po                     # this command will give you the name of your pod
kubectl exec -it <name of pod> sh  # this command will let you execute commands in the container shell
There you can test whether your path exists. If it does, the fault is in your application; if it does not, the path was added incorrectly in the Docker image.
You can also add a path to your Kubernetes pod without specifying it in the Docker container. Check this link for details.
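If you go the volume route, a minimal sketch could look like this (the volume name static-files and the mount path are assumptions for illustration, not from the original chart):
spec:
  containers:
    - name: orion-toolbelt
      volumeMounts:
        - name: static-files
          mountPath: /static    # the path where the app expects its static files
  volumes:
    - name: static-files
      emptyDir: {}              # or a persistentVolumeClaim if the files must survive pod restarts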
As described by community member Marcin Ginszt:
According to the information provided in the post, it's difficult to guess where the problem is in your Django app config/settings.
Please refer to Managing static files (e.g. images, JavaScript, CSS)
NOTE:
Serving the files - STATIC_URL = '/static/'
In addition to these configuration steps, you’ll also need to actually serve the static files.
During development, if you use django.contrib.staticfiles, this will be done automatically by runserver when DEBUG is set to True (see django.contrib.staticfiles.views.serve()).
This method is grossly inefficient and probably insecure, so it is unsuitable for production.
See Deploying static files for proper strategies to serve static files in production environments.
Django doesn’t serve files itself; it leaves that job to whichever Web server you choose.
We recommend using a separate Web server – i.e., one that’s not also running Django – for serving media. Here are some good choices:
Nginx
A stripped-down version of Apache
Here you can find an example of how you can serve static files using the collectstatic command.
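As a hedged illustration of that pattern in Kubernetes terms (the volume name, mount paths and the assumed STATIC_ROOT are illustrative, not taken from the chart in the question): an init container can run collectstatic into a shared volume, and an Nginx sidecar can then serve it:
spec:
  initContainers:
    - name: collectstatic
      image: {{ .Values.global.repository.imagerepo }}/orion-toolbelt:10.4-SNAPSHOT-15
      command: ["python", "manage.py", "collectstatic", "--noinput"]
      volumeMounts:
        - name: static-files
          mountPath: /app/static                    # assumed STATIC_ROOT inside the image
  containers:
    - name: orion-toolbelt
      # ... existing container spec ...
    - name: static-nginx
      image: nginx:stable
      volumeMounts:
        - name: static-files
          mountPath: /usr/share/nginx/html/static   # default Nginx docroot serves this at /static
          readOnly: true
  volumes:
    - name: static-files
      emptyDir: {}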
Please let me know if it helped.
Deployment.yaml
...
env: {{ .Values.env}}
...
Values.yaml:
env:
  - name: "DELFI_DB_USER"
    value: "yyy"
  - name: "DELFI_DB_PASSWORD"
    value: "xxx"
  - name: "DELFI_DB_CLASS"
    value: "com.mysql.jdbc.Driver"
  - name: "DELFI_DB_URL"
    value: "jdbc:sqlserver://dockersqlserver:1433;databaseName=ddbeta;sendStringParametersAsUnicode=false"
Feels like I'm missing something obvious.
The linter says: ok
helm template says:
env: [map[name:DELFI_DB_USER value:yyy] map[name:DELFI_DB_PASSWORD
value:xxx] map[name:DELFI_DB_CLASS value:com.mysql.jdbc.Driver]
map[value:jdbc:mysql://dockersqlserver.{{ .Release.Namespace
}}.svc.cluster.local:3306/ddbeta\?\&\;useSSL=true\&\;requireSSL=false
name:DELFI_DB_URL]]
helm upgrade says:
Error: UPGRADE FAILED: YAML parse error on
xxx/templates/deployment.yaml: error converting YAML to JSON: yaml:
line 35: found unexpected ':'
Solution:
env:
{{- range .Values.env }}
  - name: {{ .name | quote }}
    value: {{ .value | quote }}
{{- end }}
The current Go template expansion will give output which is not YAML:
env: {{ .Values.env}}
becomes:
env: [Some Go type stuff that isn't YAML]...
The Helm Go template needs to loop over the entries of the source YAML list.
This is described in the Helm docs.
The correct Deployment.yaml is:
...
env:
{{- range .Values.env }}
  - name: {{ .name | quote }}
    value: {{ .value | quote }}
{{- end }}
...
Helm includes undocumented toYaml and toJson template functions; either will work here (because valid JSON is valid YAML). A shorter path could be
env: {{- .Values.env | toYaml | nindent 2 }}
Note that you need to be a little careful with the indentation, particularly if you're setting any additional environment variables that aren't in that list. In this example I've asked Helm to indent the YAML list two steps more, so additional environment values need to follow that too:
env: {{- .Values.env | toYaml | nindent 2 }}
  - name: OTHER_SERVICE_URL
    value: "http://other-service.default.svc.cluster.local"
I need to add a database, root or user, and password in the following:
- name: deployed-database-instance
  type: sqladmin.v1beta4.instance
  properties:
    backendType: SECOND_GEN
    databaseVersion: MYSQL_5_7
    settings:
      tier: db-f1-micro
I believe this example from this GitHub repo would be a good place to start testing. From my test I was able to create an instance, a database and a user. See my modified version of that example below; I have mainly just removed the failover replica and changed the delete-user block to insert instead of delete:
{% set deployment_name = env['deployment'] %}
{% set instance_name = deployment_name + '-instance' %}
{% set database_name = deployment_name + '-db' %}

resources:
- name: {{ instance_name }}
  type: gcp-types/sqladmin-v1beta4:instances
  properties:
    region: {{ properties['region'] }}
    settings:
      tier: {{ properties['tier'] }}
      backupConfiguration:
        binaryLogEnabled: true
        enabled: true
- name: {{ database_name }}
  type: gcp-types/sqladmin-v1beta4:databases
  properties:
    name: {{ database_name }}
    instance: $(ref.{{ instance_name }}.name)
    charset: utf8
- name: insert-user-root
  action: gcp-types/sqladmin-v1beta4:sql.users.insert
  metadata:
    runtimePolicy:
      - CREATE
    dependsOn:
      - {{ database_name }}
  properties:
    project: {{ env['project'] }}
    instance: $(ref.{{ env['deployment'] }}-instance.name)
    name: testuser
    host: "%"
    password: testpass
So what I did was:
1) Cloned the repo;
2) Went to the directory .\examples\v2\sqladmin\jinja;
3) Modified the sqladmin.jinja file as above;
4) Opened the gcloud command prompt and went to the directory from step 2;
5) Deployed using 'gcloud deployment-manager deployments create my-database --config sqladmin.yaml'
All you would need to do is adjust the resource names.
I generated this from Python, but I think in Jinja it would be:
properties:
  region: {{ properties['region'] }}
  rootPassword: '12345'
  settings:
    tier: {{ properties['tier'] }}
    backupConfiguration:
      binaryLogEnabled: true
      enabled: true
I just found this out today, sorry for the late reply.