This is very similar to another question, but there are slight differences. I tried the accepted answer and still had no luck.
I get this error when I run the command:
bosh -d prometheus deploy -n pfg-prometheus-boshrelease/manifests/prometheus.yml -o replace_vars.yml
Expected to find a map at path '/instance_groups/name=prometheus2/jobs/name=prometheus2/properties/prometheus/scrape_configs/job_name=bosh/static_configs/targets?' but found '[]interface {}'
replace_vars.yml:
- type: replace
  path: /instance_groups/name=prometheus2/jobs/name=prometheus2/properties/prometheus/scrape_configs/job_name=bosh/static_configs/targets?/-
  value: 192.168.123.26:9190
Manifest section:
- name: prometheus2
  properties:
    prometheus:
      rule_files:
      - ...
      scrape_configs:
      - file_sd_configs:
        - files:
          - /var/vcap/store/bosh_exporter/bosh_target_groups.json
        job_name: prometheus
        relabel_configs:
        - action: keep
          ...
        - regex: (.*)
          ...
      - job_name: bosh
        scrape_interval: 2m
        scrape_timeout: 1m
        static_configs:
        - targets:
          - localhost:9190
What would the correct path be?
EDIT: I have looked through the bosh CLI ops file examples but cannot find an example like mine.
I have also stumbled upon this several times and never found a solution for this use case. What I usually do as a workaround is to replace one level up.
For your example:
/tmp/replace-vars.yml:
- type: replace
  path: /instance_groups/name=prometheus2/jobs/name=prometheus2/properties/prometheus/scrape_configs/job_name=bosh/static_configs/0
  value:
    targets:
    - 192.168.123.26:9190
    - localhost:9190
/tmp/test-manifest.yml:
instance_groups:
- name: prometheus2
  jobs:
  - name: prometheus2
    properties:
      prometheus:
        rule_files:
        - abc
        scrape_configs:
        - file_sd_configs:
          - files:
            - /var/vcap/store/bosh_exporter/bosh_target_groups.json
          job_name: prometheus
          relabel_configs:
          - action: keep
          - regex: (.*)
        - job_name: bosh
          scrape_interval: 2m
          scrape_timeout: 1m
          static_configs:
          - targets:
            - localhost:9190
Interpolated by bosh int /tmp/test-manifest.yml -o /tmp/replace-vars.yml:
instance_groups:
- jobs:
  - name: prometheus2
    properties:
      prometheus:
        rule_files:
        - abc
        scrape_configs:
        - file_sd_configs:
          - files:
            - /var/vcap/store/bosh_exporter/bosh_target_groups.json
          job_name: prometheus
          relabel_configs:
          - action: keep
          - regex: (.*)
        - job_name: bosh
          scrape_interval: 2m
          scrape_timeout: 1m
          static_configs:
          - targets:
            - 192.168.123.26:9190
            - localhost:9190
  name: prometheus2
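A possible alternative for the original ops file (an untested sketch): the error suggests that static_configs is itself an array, so targets cannot be addressed until an element of that array is selected. Selecting the first static_configs entry by index and appending to its targets list might then work:
- type: replace
  path: /instance_groups/name=prometheus2/jobs/name=prometheus2/properties/prometheus/scrape_configs/job_name=bosh/static_configs/0/targets/-
  value: 192.168.123.26:9190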
Related
I have a Django project deployed in Kubernetes and I am trying to deploy Prometheus as a monitoring tool. I have successfully done all the steps needed to include django_prometheus in the project, and locally I can go to localhost:9090 and play around with querying the metrics.
I have also deployed Prometheus to my Kubernetes cluster and upon running a kubectl port-forward ... on the Prometheus pod I can see some metrics of my Kubernetes resources.
Where I am a bit confused is how to make the deployed Django app metrics available on the Prometheus dashboard just like the others.
I deployed my app in the default namespace and Prometheus in a dedicated monitoring namespace. I am wondering what I am missing here. Do I need to expose ports 8000 to 8005 on the service and deployment, according to the number of workers, or something like that?
My Django app runs with gunicorn using supervisord like so:
[program:gunicorn]
command=gunicorn --reload --timeout 200000 --workers=5 --limit-request-line 0 --limit-request-fields 32768 --limit-request-field_size 0 --chdir /code/ my_app.wsgi
my_app service:
apiVersion: v1
kind: Service
metadata:
  name: my_app
  namespace: default
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: my-app
  sessionAffinity: None
  type: ClusterIP
Trimmed version of the deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: my-app
  name: my-app-deployment
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: my-app
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - image: ...
        imagePullPolicy: IfNotPresent
        name: my-app
        ports:
        - containerPort: 80
          name: http
          protocol: TCP
      dnsPolicy: ClusterFirst
      imagePullSecrets:
      - name: regcred
      restartPolicy: Always
      schedulerName: default-scheduler
      terminationGracePeriodSeconds: 30
prometheus configmap:
apiVersion: v1
data:
  prometheus.rules: |-
    ... some rules
  prometheus.yml: |-
    global:
      scrape_interval: 5s
      evaluation_interval: 5s
    rule_files:
    - /etc/prometheus/prometheus.rules
    scrape_configs:
    - job_name: prometheus
      static_configs:
      - targets:
        - localhost:9090
    - job_name: my-app
      metrics_path: /metrics
      static_configs:
      - targets:
        - localhost:8000
    - job_name: 'node-exporter'
      kubernetes_sd_configs:
      - role: endpoints
      relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: 'node-exporter'
        action: keep
kind: ConfigMap
metadata:
  labels:
    name: prometheus-config
  name: prometheus-config
  namespace: monitoring
No need to expose the application outside the cluster.
Leveraging Kubernetes service discovery, add jobs to scrape Services, Pods, or both:
- job_name: 'kubernetes-service-endpoints'
  kubernetes_sd_configs:
  - role: endpoints
  relabel_configs:
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
    regex: (.+)
  - regex: __meta_kubernetes_service_label_(.+)
    action: labelmap
  - regex: 'app_kubernetes_io_(.+)'
    action: labeldrop
  - regex: 'helm_sh_(.+)'
    action: labeldrop
- job_name: 'kubernetes-pods'
  kubernetes_sd_configs:
  - role: pod
  relabel_configs:
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
    action: keep
    regex: true
  - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
    action: replace
    regex: ([^:]+)(?::\d+)?;(\d+)
    replacement: $1:$2
    target_label: __address__
  - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
    action: replace
    target_label: __metrics_path__
    regex: (.+)
  - source_labels: [__meta_kubernetes_namespace]
    action: replace
    target_label: namespace
    regex: (.+)
  - source_labels: [__meta_kubernetes_pod_node_name]
    action: replace
    target_label: host
    regex: (.+)
  - source_labels: [__meta_kubernetes_pod_name]
    action: replace
    target_label: pod
    regex: (.+)
  - regex: __meta_kubernetes_pod_label_(.+)
    action: labelmap
  - regex: 'app_kubernetes_io_(.+)'
    action: labeldrop
  - regex: 'helm_sh_(.+)'
    action: labeldrop
Then, annotate the Service with:
metadata:
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "80"
    prometheus.io/path: "/metrics"
and the Deployment with:
spec:
  template:
    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/port: "80"
        prometheus.io/path: "/metrics"
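For the Service in the question, the annotated version would look roughly like this (a sketch combining the Service from the question with the annotations above; the port assumes the Django /metrics endpoint is reachable through the port the Service already exposes):
apiVersion: v1
kind: Service
metadata:
  name: my_app
  namespace: default
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "80"
    prometheus.io/path: "/metrics"
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: my-app
  sessionAffinity: None
  type: ClusterIP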
You do not have to expose services if Prometheus is installed on the same cluster as your app. You can communicate with apps across namespaces by using Kubernetes DNS resolution, which follows the rule:
SERVICENAME.NAMESPACE.svc.cluster.local
So one way is to change your Prometheus job target to something like this:
- job_name: speedtest-ookla
  metrics_path: /metrics
  static_configs:
  - targets:
    - 'my_app.default.svc.cluster.local:9000'
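Applied to the Service in the question (named my_app in the default namespace), the job would look something like the following, assuming the metrics are reachable through the Service port 80 rather than directly on a gunicorn port:
- job_name: my-app
  metrics_path: /metrics
  static_configs:
  - targets:
    - 'my_app.default.svc.cluster.local:80'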
This is the "manual" way. A better approach is to use Prometheus's kubernetes_sd_config, which autodiscovers your services and tries to scrape them.
Reference: https://prometheus.io/docs/prometheus/latest/configuration/configuration/#kubernetes_sd_config
Is it possible to import/extend an AWS ECS task definition in YAML? For example, assume I have the following task definition in ecs-task-django.yml:
family: 'myproject-django'
executionRoleArn: 'arn:aws:iam::1234567890:role/ecsTaskExecutionRole'
networkMode: 'awsvpc'
requiresCompatibilities:
  - FARGATE
cpu: '2048'
memory: '4096'
containerDefinitions:
  - name: 'django'
    image: '1234567890.dkr.ecr.us-east-1.amazonaws.com/myproject/django'
    portMappings:
      - containerPort: 5000
        hostPort: 5000
        protocol: tcp
    command:
      - '/start'
    environment:
      - name: 'SOME_VAR'
        value: 'value1'
      - name: 'SOME_OTHER_VAR'
        value: 'value2'
I would like to create a second task definition for a scheduled service that runs using the same image, imports and extends the above config, but does not repeat what I have already defined above. Pseudo-code would be something like ecs-task-dothing.yml:
# import ecs-task-django.yml
containerDefinitions:
  - name: 'dothing'
    command:
      - 'python manage.py dothing'
    environment:
      - name: 'SOME_VAR'
        value: 'value2'
Is this possible, or do I basically have to repeat everything from the first task definition and maintain two copies of the same values (not very DRY)?
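There is no native import/extend mechanism in the task definition format itself, but if both definitions can live in the same YAML document, plain YAML anchors and merge keys can factor out the shared mapping keys. A rough sketch under that assumption (the top-level keys and the second family name are hypothetical, and merge keys only merge mappings, so containerDefinitions still has to be spelled out per task):
defaults: &defaults
  executionRoleArn: 'arn:aws:iam::1234567890:role/ecsTaskExecutionRole'
  networkMode: 'awsvpc'
  requiresCompatibilities:
    - FARGATE
  cpu: '2048'
  memory: '4096'
django-task:
  <<: *defaults
  family: 'myproject-django'
  containerDefinitions:
    - name: 'django'
      image: '1234567890.dkr.ecr.us-east-1.amazonaws.com/myproject/django'
      command:
        - '/start'
dothing-task:
  <<: *defaults
  family: 'myproject-dothing'
  containerDefinitions:
    - name: 'dothing'
      image: '1234567890.dkr.ecr.us-east-1.amazonaws.com/myproject/django'
      command:
        - 'python manage.py dothing'
Whatever tooling turns this into register-task-definition calls would still need to resolve the anchors (most YAML parsers do) and split the document into two task definitions.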
I'm trying to follow the official tutorial: https://docs.aws.amazon.com/cdk/latest/guide/getting_started.html
I picked Java but tried Typescript too with the same result: the S3 bucket resource template isn't produced and the deploy doesn't create the bucket.
The code is here: https://github.com/jumarko/aws-experiments/tree/master/cdk/hello-cdk-java
The cdk synth command produces only this CDKMetadata and nothing else:
cd hello-cdk-java
cdk init app --language java
mvn compile
cdk ls
# modify the stack java code
...
# this for some reason only outputs Metadata for me
# even `mvn clean package` doesn't help
cdk synth
Resources:
  CDKMetadata:
    Type: AWS::CDK::Metadata
    Properties:
      Modules: aws-cdk=1.61.1,@aws-cdk/cloud-assembly-schema=1.61.1,@aws-cdk/core=1.61.1,@aws-cdk/cx-api=1.61.1,jsii-runtime=Java/14.0.1
    Condition: CDKMetadataAvailable
Conditions:
  CDKMetadataAvailable:
    Fn::Or:
      - Fn::Or:
          - Fn::Equals:
              - Ref: AWS::Region
              - ap-east-1
          - Fn::Equals:
              - Ref: AWS::Region
              - ap-northeast-1
          - Fn::Equals:
              - Ref: AWS::Region
              - ap-northeast-2
          - Fn::Equals:
              - Ref: AWS::Region
              - ap-south-1
          - Fn::Equals:
              - Ref: AWS::Region
              - ap-southeast-1
          - Fn::Equals:
              - Ref: AWS::Region
              - ap-southeast-2
          - Fn::Equals:
              - Ref: AWS::Region
              - ca-central-1
          - Fn::Equals:
              - Ref: AWS::Region
              - cn-north-1
          - Fn::Equals:
              - Ref: AWS::Region
              - cn-northwest-1
          - Fn::Equals:
              - Ref: AWS::Region
              - eu-central-1
      - Fn::Or:
          - Fn::Equals:
              - Ref: AWS::Region
              - eu-north-1
          - Fn::Equals:
              - Ref: AWS::Region
              - eu-west-1
          - Fn::Equals:
              - Ref: AWS::Region
              - eu-west-2
          - Fn::Equals:
              - Ref: AWS::Region
              - eu-west-3
          - Fn::Equals:
              - Ref: AWS::Region
              - me-south-1
          - Fn::Equals:
              - Ref: AWS::Region
              - sa-east-1
          - Fn::Equals:
              - Ref: AWS::Region
              - us-east-1
          - Fn::Equals:
              - Ref: AWS::Region
              - us-east-2
          - Fn::Equals:
              - Ref: AWS::Region
              - us-west-1
          - Fn::Equals:
              - Ref: AWS::Region
              - us-west-2
Any clue on what's going on or how to debug the issue?
I'm using Mac OS X 10.15.6 with the following CLI versions:
$ aws --version
aws-cli/2.0.10 Python/3.8.2 Darwin/19.6.0 botocore/2.0.0dev14
$ cdk --version
1.61.1 (build 347918f)
I generated a new project from the template and it suddenly started to work.
I'm not sure what changed - I also experimented with using different --profile but at first that didn't work either.
The issue is solved now - if something goes wrong it's worth starting from scratch again!
My deploy to localstack running in Docker (the deploy is done via serverless-localstack) fails every time. With version 0.11.1 it worked after 3-5 attempts (without changing anything), and with the latest version I cannot get the whole stack deployed to localstack at all. My localstack Docker image is the latest.
Deploying one or two functions alone works when the rest is commented out.
It works even if I change the handler of all functions to a simple handler like in some serverless examples, so the error seems to appear as the package size increases.
My first guess was that there are timeouts while deploying, because the complete package is relatively big for eight functions and the deployment takes quite long. Another observation: when the error occurs, 1-2 functions and the resources/streams are already deployed but not accessible. I tried giving Docker more resources, but that did not help either.
Errors from localstack:
Running CloudFormation stack deployment loop iteration 1
localstack_1 | 2020-05-30T20:31:40:DEBUG:localstack.services.cloudformation.cloudformation_starter: Currently processing stack resource api-dev-dev/GraphQLMainHandlerLambdaVersionOcnwJH5Fo772wd6pU0oM5FhOPeqfRUzVOcuaYd1Oizk: None
localstack_1 | 2020-05-30T20:31:40:DEBUG:localstack.services.cloudformation.cloudformation_starter: Currently processing stack resource api-dev-dev/GraphQLMainHandlerLambdaFunction: True
localstack_1 | 2020-05-30T20:31:40:ERROR:localstack.services.cloudformation.cloudformation_starter: Unable to parse and create resource "GraphQLMainHandlerLambdaVersionOcnwJH5Fo772wd6pU0oM5FhOPeqfRUzVOcuaYd1Oizk": 'FunctionName' Traceback (most recent call last):
Unable to parse and create resource "GraphQLMainHandlerLambdaFunction": An error occurred (ResourceConflictException) when calling the CreateFunction operation: Function already exist: api-dev-dev-graphQLMainHandler Traceback (most recent call last)
...
botocore.errorfactory.ResourceConflictException: An error occurred (ResourceConflictException) when calling the CreateFunction operation: Function already exist: api-dev-dev-graphQLMainHandler
Plus many more errors with the same causes.
serverless.yaml:
service:
  name: api-${self:provider.stage}
provider:
  name: aws
  runtime: nodejs12.x
  stage: ${opt:stage, 'dev'}
  region: ${opt:region, 'eu-central-1'}
  profile: dev-local
  environment:
    STAGE: ${self:provider.stage}
    MAIN_TABLE_NAME: ${self:custom.mainTableName}
    MAIN_BUCKET_NAME: ${self:custom.mainBucketName}
    #MAIN_TABLE_STREAM_ARN:
    #  Fn::GetAtt:
    #    - MainTable
    #    - StreamArn
    #DYNAMODB_REGION: ${self:provider.region}
    #AWS_PROFILE: ${self:provider.profile}
    AWS_REGION: ${self:provider.region}
package:
  include:
    - src/**
functions:
  graphQLMainHandler:
    handler: src/handler/graphQLMainHandler.graphQLMainHandler
    events:
      - http:
          method: post
          path: graphql-main
          cors: true
  dynamoDBStreamHandler:
    handler: src/handler/dynamoDBStreamHandler.dynamoDBStreamHandler
    events:
      - stream:
          type: dynamodb
          arn:
            Fn::GetAtt:
              - MainTable
              - StreamArn
          batchSize: 10
          startingPosition: TRIM_HORIZON
  pictureHandler:
    handler: src/handler/pictureHandler.pictureHandler
    events:
      - http:
          method: get
          path: picture
          cors: true
  taskHandler:
    handler: src/handler/taskHandler.taskHandler
  updateMainTableElasticsearchMappings:
    handler: src/handler/dev.updateMainTableElasticsearchMappings
  createMainTableElasticsearchIndex:
    handler: src/handler/dev.createMainTableElasticsearchIndex
  analysisTaskRunner:
    handler: src/handler/taskRunner.analysisTaskRunner
  sendNotificationsTaskRunner:
    handler: src/handler/taskRunner.sendNotificationsTaskRunner
plugins:
  - serverless-webpack
  - serverless-localstack
  - serverless-offline
resources:
  Resources:
    MainTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${self:custom.mainTableName}
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        BillingMode: PAY_PER_REQUEST
        StreamSpecification:
          StreamViewType: NEW_IMAGE
    MainBucket:
      Type: AWS::S3::Bucket
      Properties:
        BucketName: ${self:custom.mainBucketName}
        AccessControl: PublicRead
        CorsConfiguration:
          CorsRules:
            - AllowedMethods:
                - GET
              AllowedOrigins:
                - "*"
              AllowedHeaders:
                - "*"
custom:
  serverless-offline:
    host: 0.0.0.0
    port: 3000
  webpack:
    packager: yarn
    webpackConfig: ./webpack.config.js
    includeModules: true
  localstack:
    debug: true
    stages:
      - dev
    autostart: false
    #lambda:
    #  mountCode: false
  mainTableName: main-${self:provider.stage}
  mainBucketName: main-${self:provider.stage}
docker-compose.dev.yml:
version: '3'
services:
  localstack:
    privileged: true
    image: localstack/localstack
    environment:
      - EDGE_PORT=4566
      - DEFAULT_REGION=eu-central-1
      - AWS_DEFAULT_REGION=eu-central-1
      - DEBUG=1
      - COMPOSE_PARALLEL_LIMIT=100
      - LAMBDA_EXECUTOR=docker-reuse
      - DOCKER_HOST=unix:///var/run/docker.sock
      - USE_SSL=1
    volumes:
      - "${TMPDIR:-/tmp/localstack}:/tmp/localstack"
      - "/var/run/docker.sock:/var/run/docker.sock"
    ports:
      - "4566-4599:4566-4599"
(...)
Let's say I have an app that can run in a dry-run mode, set by a flag on the command line (myapp --dryrun), and the CloudFormation template for its task definition is:
MyTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: myApp
        Image: user/myapp:latest
        Command:
          - ./myapp
          - --dryrun
        Environment:
          - Name: SOME_ENV_VAR
            Value: !Ref SomeEnvVar
I am trying to create a single CloudFormation template for a task definition that can be used in both a development and a production environment, where the dry-run flag is set only for the development environment.
Is there some way to set a conditional command, or am I going to have to resort to a hacky string that I pass in, like:
Command:
  - ./myapp
  - !Ref DryRun
The neatest solution was to use an Fn::If function, which sets the flag when true and uses AWS::NoValue when false.
AWS::NoValue removes the list entry completely, meaning the command when true is ["./myapp", "--dryrun"] and the command when false is ["./myapp"].
Unfortunately, there is no Bool type for CloudFormation parameters, so you have to pass this in as a String and then use a Condition to convert from the String to a boolean.
MyTaskDefinition:
  Type: AWS::ECS::TaskDefinition
  Properties:
    ContainerDefinitions:
      - Name: myApp
        Image: user/myapp:latest
        Command:
          - ./myapp
          - Fn::If:
              - UseDryRun
              - --dryrun
              - Ref: AWS::NoValue
Parameters:
  DryRun:
    Type: String
    AllowedValues: # No bool parameter in CFN
      - true
      - false
Conditions:
  UseDryRun: !Equals [ !Ref DryRun, true ]
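The DryRun value is then supplied per environment when the stack is created or updated; for example (an illustrative command, assuming the template is saved as template.yml and deployed with the AWS CLI):
aws cloudformation deploy --template-file template.yml --stack-name myapp-dev --parameter-overrides DryRun=true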