Confluent Schema Registry on Strimzi - pods not getting created - google-cloud-platform

I have Strimzi Kafka installed on GKE (GCP), and I'm trying to install the Confluent Schema Registry by following this link:
https://github.com/lsst-sqre/strimzi-registry-operator
Steps followed:
Step 1: Installed strimzi-registry-operator in namespace schema-registry-operator.
(Note: Strimzi Kafka is installed in namespace kafka.)
Command used:
helm repo add lsstsqre https://lsst-sqre.github.io/charts/
helm repo update
helm install ssr lsstsqre/strimzi-registry-operator -n schema-registry-operator --values values.yaml
values.yaml:
------------
# -- Name of the Strimzi Kafka cluster
clusterName: "versa-kafka-gke"
# -- Namespace where the Strimzi Kafka cluster is deployed
clusterNamespace: "kafka"
# -- Namespace where the strimzi-registry-operator is deployed
operatorNamespace: "strimzi-registry-operator"
Step 2:
Installed the KafkaTopic (registry-schemas) and KafkaUser in namespace 'kafka'.
(Note: Strimzi Kafka is also installed in the namespace kafka.)
kafkatopic.yaml
----------------
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: registry-schemas
  labels:
    strimzi.io/cluster: versa-kafka-gke
spec:
  partitions: 1
  replicas: 3
  config:
    # http://kafka.apache.org/documentation/#topicconfigs
    cleanup.policy: compact
kafkauser.yaml
--------------
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaUser
metadata:
  name: confluent-schema-registry
  labels:
    strimzi.io/cluster: versa-kafka-gke
spec:
  authentication:
    type: tls
  authorization:
    # Official docs on authorizations required for the Schema Registry:
    # https://docs.confluent.io/current/schema-registry/security/index.html#authorizing-access-to-the-schemas-topic
    type: simple
    acls:
      # Allow all operations on the registry-schemas topic
      # Read, Write, and DescribeConfigs are known to be required
      - resource:
          type: topic
          name: registry-schemas
          patternType: literal
        operation: All
        type: allow
      # Allow all operations on the schema-registry* group
      - resource:
          type: group
          name: schema-registry
          patternType: prefix
        operation: All
        type: allow
      # Allow Describe on the __consumer_offsets topic
      - resource:
          type: topic
          name: __consumer_offsets
          patternType: literal
        operation: Describe
        type: allow
Step 3:
Installed StrimziSchemaRegistry in namespace strimzi-schema-operator.
Here is what I see in the namespace schema-registry-operator:
(base) Karans-MacBook-Pro:schema-registry-yamls karanalang$ kc get all -n schema-registry-operator
NAME                                             READY   STATUS    RESTARTS   AGE
pod/strimzi-registry-operator-7867fbc985-rddqw   1/1     Running   0          121m

NAME                                        READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/strimzi-registry-operator   1/1     1            1           121m

NAME                                                   DESIRED   CURRENT   READY   AGE
replicaset.apps/strimzi-registry-operator-7867fbc985   1         1         1       121m
Also, when I log on to the SchemaRegistryOperator pod, I see the following error.
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 108, in guard
await coro
File "/opt/venv/lib/python3.10/site-packages/kopf/_core/reactor/queueing.py", line 175, in watcher
async for raw_event in stream:
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/watching.py", line 82, in infinite_watch
async for raw_event in stream:
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/watching.py", line 159, in continuous_watch
objs, resource_version = await fetching.list_objs(
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/fetching.py", line 28, in list_objs
rsp = await api.get(
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/api.py", line 111, in get
response = await request(
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
return await fn(*args, **kwargs, context=context)
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/api.py", line 85, in request
await errors.check_response(response) # but do not parse it!
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIForbiddenError: ('secrets is forbidden: User "system:serviceaccount:schema-registry-operator:strimzi-registry-operator" cannot list resource "secrets" in API group "" in the namespace "kafka"', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'secrets is forbidden: User "system:serviceaccount:schema-registry-operator:strimzi-registry-operator" cannot list resource "secrets" in API group "" in the namespace "kafka"', 'reason': 'Forbidden', 'details': {'kind': 'secrets'}, 'code': 403})
[2022-11-30 23:27:39,605] kopf._cogs.clients.w [DEBUG ] Stopping the watch-stream for strimzischemaregistries.v1beta1.roundtable.lsst.codes in 'kafka'.
[2022-11-30 23:27:39,606] kopf._core.reactor.o [ERROR ] Watcher for strimzischemaregistries.v1beta1.roundtable.lsst.codes#kafka has failed: ('strimzischemaregistries.roundtable.lsst.codes is forbidden: User "system:serviceaccount:schema-registry-operator:strimzi-registry-operator" cannot list resource "strimzischemaregistries" in API group "roundtable.lsst.codes" in the namespace "kafka"', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'strimzischemaregistries.roundtable.lsst.codes is forbidden: User "system:serviceaccount:schema-registry-operator:strimzi-registry-operator" cannot list resource "strimzischemaregistries" in API group "roundtable.lsst.codes" in the namespace "kafka"', 'reason': 'Forbidden', 'details': {'group': 'roundtable.lsst.codes', 'kind': 'strimzischemaregistries'}, 'code': 403})
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/errors.py", line 148, in check_response
response.raise_for_status()
File "/opt/venv/lib/python3.10/site-packages/aiohttp/client_reqrep.py", line 1004, in raise_for_status
raise ClientResponseError(
aiohttp.client_exceptions.ClientResponseError: 403, message='Forbidden', url=URL('https://10.44.0.1:443/apis/roundtable.lsst.codes/v1beta1/namespaces/kafka/strimzischemaregistries')
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/aiokits/aiotasks.py", line 108, in guard
await coro
File "/opt/venv/lib/python3.10/site-packages/kopf/_core/reactor/queueing.py", line 175, in watcher
async for raw_event in stream:
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/watching.py", line 82, in infinite_watch
async for raw_event in stream:
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/watching.py", line 159, in continuous_watch
objs, resource_version = await fetching.list_objs(
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/fetching.py", line 28, in list_objs
rsp = await api.get(
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/api.py", line 111, in get
response = await request(
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/auth.py", line 45, in wrapper
return await fn(*args, **kwargs, context=context)
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/api.py", line 85, in request
await errors.check_response(response) # but do not parse it!
File "/opt/venv/lib/python3.10/site-packages/kopf/_cogs/clients/errors.py", line 150, in check_response
raise cls(payload, status=response.status) from e
kopf._cogs.clients.errors.APIForbiddenError: ('strimzischemaregistries.roundtable.lsst.codes is forbidden: User "system:serviceaccount:schema-registry-operator:strimzi-registry-operator" cannot list resource "strimzischemaregistries" in API group "roundtable.lsst.codes" in the namespace "kafka"', {'kind': 'Status', 'apiVersion': 'v1', 'metadata': {}, 'status': 'Failure', 'message': 'strimzischemaregistries.roundtable.lsst.codes is forbidden: User "system:serviceaccount:schema-registry-operator:strimzi-registry-operator" cannot list resource "strimzischemaregistries" in API group "roundtable.lsst.codes" in the namespace "kafka"', 'reason': 'Forbidden', 'details': {'group': 'roundtable.lsst.codes', 'kind': 'strimzischemaregistries'}, 'code': 403})
A few questions on this:
I don't see the Schema Registry pod (which would be listening on port 8081) getting created; only the StrimziSchemaRegistry object is created in namespace strimzi-schema-operator.
How do I get access to the Schema Registry URL so I can upload schemas to it?
How do I resolve the permission error above?
Do I need to create a separate service account for installing the Schema Registry?
Please advise.
TIA!
Update:
This is an existing issue with the Strimzi Schema Registry operator (https://github.com/lsst-sqre/strimzi-registry-operator/issues/79).
Essentially, the ServiceAccount is not created in the correct namespace; I re-created the ServiceAccount in namespace strimzi-registry-operator to resolve the issue.
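For reference, the kind of access the 403 errors above complain about can be granted with a Role/RoleBinding in the kafka namespace for the operator's ServiceAccount. This is only a minimal sketch based on the error messages; the object names and verb lists are my assumptions (not taken from the operator's chart), and the ServiceAccount namespace should match wherever the operator pod actually runs:

# Sketch only: RBAC in the 'kafka' namespace for the operator's ServiceAccount.
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: strimzi-registry-operator-kafka   # assumed name
  namespace: kafka
rules:
  - apiGroups: [""]
    resources: ["secrets"]
    verbs: ["get", "list", "watch"]
  - apiGroups: ["roundtable.lsst.codes"]
    resources: ["strimzischemaregistries"]
    verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: strimzi-registry-operator-kafka   # assumed name
  namespace: kafka
subjects:
  - kind: ServiceAccount
    name: strimzi-registry-operator
    namespace: schema-registry-operator   # namespace from the error message above
roleRef:
  kind: Role
  name: strimzi-registry-operator-kafka
  apiGroup: rbac.authorization.k8s.io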
However, I'm facing another issue (also an existing issue - https://github.com/lsst-sqre/strimzi-registry-operator/issues/84): the Schema Registry is still not getting created.
Additional details:
Schema-Registry-operator is deployed in namespace 'strimzi-registry-operator'.
Strimzi Kafka (cluster versa-kafka-gke) is deployed in namespace 'kafka'.
Part of the Strimzi Kafka yaml, with version & listeners:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
  name: versa-kafka-gke #1
spec:
  kafka:
    version: 3.0.0
    replicas: 3
    listeners:
      - name: plain
        port: 9092
        type: internal
        tls: false
      - name: tls
        port: 9093
        type: internal
        tls: true
        authentication:
          type: tls
      - name: external
        port: 9094
        type: loadbalancer
        tls: true
        authentication:
          type: tls
    authorization:
      type: simple
KafkaUser (confluent-schema-registry) & KafkaTopic (registry-schemas) are deployed in namespace 'kafka'.
Confluent Schema Registry is deployed in namespace 'kafka'.
Error:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Error Logging 73s kopf Handler 'create_registry' failed with an exception. Will retry.
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/execution.py", line 279, in execute_handler_once
result = await invoke_handler(
File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/execution.py", line 374, in invoke_handler
result = await invocation.invoke(
File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/invocation.py", line 139, in invoke
await asy...al/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/opt/venv/lib/python3.10/site-packages/strimziregistryoperator/handlers/createregistry.py", line 131, in create_registry
bootstrap_server = get_kafka_bootstrap_server(
File "/opt/venv/lib/python3.10/site-packages/strimziregistryoperator/deployments.py", line 83, in get_kafka_bootstrap_server
raise kopf.Error(msg, delay=10)
AttributeError: module 'kopf' has no attribute 'Error'
Normal Logging 73s kopf Creating a new Schema Registry deployment: confluent-schema-registry with listener=tls (security protocol=tls) and strimzi-version=v1beta2 serviceType=ClusterIP image=confluentinc/cp-schema-registry:7.2.1
Normal Logging 12s kopf Creating a new Schema Registry deployment: confluent-schema-registry with listener=tls (security protocol=tls) and strimzi-version=v1beta2 serviceType=ClusterIP image=confluentinc/cp-schema-registry:7.2.1
Error Logging 12s kopf Handler 'create_registry' failed with an exception. Will retry.
Traceback (most recent call last):
File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/execution.py", line 279, in execute_handler_once
result = await invoke_handler(
File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/execution.py", line 374, in invoke_handler
result = await invocation.invoke(
File "/opt/venv/lib/python3.10/site-packages/kopf/_core/actions/invocation.py", line 139, in invoke
await asy...al/lib/python3.10/concurrent/futures/thread.py", line 58, in run
result = self.fn(*self.args, **self.kwargs)
File "/opt/venv/lib/python3.10/site-packages/strimziregistryoperator/handlers/createregistry.py", line 131, in create_registry
bootstrap_server = get_kafka_bootstrap_server(
File "/opt/venv/lib/python3.10/site-packages/strimziregistryoperator/deployments.py", line 83, in get_kafka_bootstrap_server
raise kopf.Error(msg, delay=10)
AttributeError: module 'kopf' has no attribute 'Error'

Related

S3 file not downloaded when triggering a Lambda function associated with EFS

I'm using the Serverless framework to create a Lambda function that, when triggered by an S3 upload (uploading test.vcf to s3://trigger-test/uploads/), downloads that uploaded file from S3 to EFS (specifically to the /mnt/efs/vcfs/ folder). I'm pretty new to EFS and followed AWS documentation for setting up the EFS access point, but when I deploy this application and upload a test file to trigger the Lambda function, it fails to download the file and gives this error in the CloudWatch logs:
[ERROR] FileNotFoundError: [Errno 2] No such file or directory: '/mnt/efs/vcfs/test.vcf.A0bA45dC'
Traceback (most recent call last):
File "/var/task/handler.py", line 21, in download_files_to_efs
result = s3.download_file('trigger-test', key, efs_loci)
File "/var/runtime/boto3/s3/inject.py", line 170, in download_file
return transfer.download_file(
File "/var/runtime/boto3/s3/transfer.py", line 307, in download_file
future.result()
File "/var/runtime/s3transfer/futures.py", line 106, in result
return self._coordinator.result()
File "/var/runtime/s3transfer/futures.py", line 265, in result
raise self._exception
File "/var/runtime/s3transfer/tasks.py", line 126, in __call__
return self._execute_main(kwargs)
File "/var/runtime/s3transfer/tasks.py", line 150, in _execute_main
return_value = self._main(**kwargs)
File "/var/runtime/s3transfer/download.py", line 571, in _main
fileobj.seek(offset)
File "/var/runtime/s3transfer/utils.py", line 367, in seek
self._open_if_needed()
File "/var/runtime/s3transfer/utils.py", line 350, in _open_if_needed
self._fileobj = self._open_function(self._filename, self._mode)
File "/var/runtime/s3transfer/utils.py", line 261, in open
return open(filename, mode)
My hunch is that this has to do with the local mount path specified in the Lambda function versus the Root directory path in the Details portion of the EFS access point configuration. Ultimately, I want the test.vcf file I upload to S3 to be downloaded to the EFS folder: /mnt/efs/vcfs/.
Relevant files:
serverless.yml:
service: LambdaEFS-trigger-test
frameworkVersion: '2'
provider:
  name: aws
  runtime: python3.8
  stage: dev
  region: us-west-2
  vpc:
    securityGroupIds:
      - sg-XXXXXXXX
      - sg-XXXXXXXX
      - sg-XXXXXXXX
    subnetIds:
      - subnet-XXXXXXXXXX
functions:
  cfnPipelineTrigger:
    handler: handler.download_files_to_efs
    description: Lambda to download S3 file to EFS folder.
    events:
      - s3:
          bucket: trigger-test
          event: s3:ObjectCreated:*
          rules:
            - prefix: uploads/
            - suffix: .vcf
          existing: true
    fileSystemConfig:
      localMountPath: /mnt/efs
      arn: arn:aws:elasticfilesystem:us-west-2:XXXXXXXXXX:access-point/fsap-XXXXXXX
    iamRoleStatements:
      - Effect: Allow
        Action:
          - s3:ListBucket
        Resource:
          - arn:aws:s3:::trigger-test
      - Effect: Allow
        Action:
          - s3:GetObject
          - s3:GetObjectVersion
        Resource:
          - arn:aws:s3:::trigger-test/uploads/*
      - Effect: Allow
        Action:
          - elasticfilesystem:ClientMount
          - elasticfilesystem:ClientWrite
          - elasticfilesystem:ClientRootAccess
        Resource:
          - arn:aws:elasticfilesystem:us-west-2:XXXXXXXXXX:file-system/fs-XXXXXX
plugins:
  - serverless-iam-roles-per-function
package:
  individually: true
  exclude:
    - '**/*'
  include:
    - handler.py
handler.py:
import json
import boto3

s3 = boto3.client('s3', region_name='us-west-2')

def download_files_to_efs(event, context):
    """
    Locates the S3 file name (i.e. S3 object "key" value) that initiated the Lambda call, then downloads the file
    into the locally attached EFS drive at the target location.
    :param: event | S3 event record
    :return: dict
    """
    print(event)
    key = event.get('Records')[0].get('s3').get('object').get('key')  # bucket: trigger-test, key: uploads/test.vcf
    efs_loci = f"/mnt/efs/vcfs/{key.split('/')[-1]}"  # '/mnt/efs/vcfs/test.vcf'
    print("key: %s, efs_loci: %s" % (key, efs_loci))
    result = s3.download_file('trigger-test', key, efs_loci)
    if result:
        print('Download Success...')
    else:
        print('Download failed...')
    return {'status_code': 200}
EFS Access Point details:
Details
Root directory path: /vcfs
POSIX
USER ID: 1000
Group ID: 1000
Root directory creation permissions
Owner User ID: 1000
Owner Group ID: 1000
POSIX permissions to apply to the root directory path: 777
Your local mount path is localMountPath: /mnt/efs, and the access point's root directory is /vcfs, so in your code you should be using only /mnt/efs (not /mnt/efs/vcfs):
efs_loci = f"/mnt/efs/{key.split('/')[-1]}"  # '/mnt/efs/test.vcf'
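To make the mapping explicit, here is how the values from the question line up (this is only a restatement of the configuration above, not extra configuration to add):

# The Lambda's mount and the EFS access point, as configured in the question:
fileSystemConfig:
  localMountPath: /mnt/efs    # path visible inside the Lambda function
# EFS access point root directory: /vcfs
# A write to /mnt/efs/test.vcf inside the Lambda therefore lands in
# /vcfs/test.vcf on the file system; /mnt/efs/vcfs/ does not exist unless
# something creates that directory first.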

data not being updated in aws when using kinesis data stream

I'm trying to implement Kinesis Data Streams in my React Native application using Amplify.
I have followed the steps as stated in the official documentation, but no data shows up when I check the AWS console.
This is my code:
Analytics.addPluggable(new AWSKinesisProvider())

Analytics.configure({
  AWSKinesis: {
    // OPTIONAL - Amazon Kinesis service region
    region: 'us-east-2',
    // OPTIONAL - The buffer size for events in number of items.
    bufferSize: 1000,
    // OPTIONAL - The number of events to be deleted from the buffer when flushed.
    flushSize: 100,
    // OPTIONAL - The interval in milliseconds to perform a buffer check and flush if necessary.
    flushInterval: 5000, // 5s
    // OPTIONAL - The limit for failed recording retries.
    resendLimit: 5
  }
})

Analytics.record({
  data: {
    mydata: "new data"
  },
  streamName: '*myStreamName*'
}, 'AWSKinesis');
After executing, this is what my debugger console looks like:
LOG [DEBUG] 31:18.860 Credentials - getting credentials
LOG [DEBUG] 31:18.862 Credentials - picking up credentials
LOG [DEBUG] 31:18.863 Credentials - getting new cred promise
LOG [DEBUG] 31:18.866 Credentials - checking if credentials exists and not expired
LOG [DEBUG] 31:18.868 Credentials - are these credentials expired? {"accessKeyId": "****", "authenticated": true, "expiration": 2020-11-03T11:29:53.000Z, "identityId": "****", "secretAccessKey": "****", "sessionToken": "****"}
LOG [DEBUG] 31:18.871 Credentials - credentials not changed and not expired, directly return
LOG [DEBUG] 31:18.905 AWSKinesisProvider - set credentials for analytics undefined
LOG [DEBUG] 31:20.576 AWSKinesisProvider - init clients
LOG [DEBUG] 31:20.579 AWSKinesisProvider - initialize kinesis with credentials {"accessKeyId": "****", "authenticated": true, "identityId": "****", "secretAccessKey": "****", "sessionToken": "****"}
LOG [DEBUG] 31:20.624 AWSKinesisProvider - putting records to kinesis with records [{"Data": [123, 34, 109, 121, 100, 97, 116, 97, 34, 58, 34, 104, 101, 108, 108,
111, 32, 105, 109, 32, 110, 101, 119, 32, 104, 101, 114, 101, 34, 125], "PartitionKey": "partition-us-east-2:****"}]
LOG [DEBUG] 31:22.311 AWSKinesisProvider - Upload records to stream *myStreamName*

RunAsUser issue & Clicking external IP of load balancer -> Bad Request (400) on deploying Django app on GKE (Kubernetes) and db connection failing:

Problem:
I have 2 replicas for my app, library. I have configured a service to talk to my two replicas. I have a Dockerfile that starts my app on port 8080. The load balancer and the app containers all seem to be running. However, I am not able to connect to them. Every time I click on the external IP shown I get "Bad Request (400)".
This log from the container seems to indicate something with the db connection going wrong...
2019-12-07 07:33:56.669 IST
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
    self.connect()
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 195, in connect
    self.connection = self.get_new_connection(conn_params)
  File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
    connection = Database.connect(**conn_params)
  File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 126, in connect
    conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: could not connect to server: Connection refused
$ kubectl get services
NAME          TYPE           CLUSTER-IP      EXTERNAL-IP    PORT(S)        AGE
kubernetes    ClusterIP      10.19.240.1     <none>         443/TCP        61m
library-svc   LoadBalancer   10.19.254.164   34.93.141.11   80:30227/TCP   50m
$ kubectl get pods
NAME                       READY   STATUS    RESTARTS   AGE
libary-6f9b45fcdb-g5xfv    1/1     Running   0          54m
library-745d6798d8-m45gq   3/3     Running   0          12m
In settings.py
ALLOWED_HOSTS = ['library-259506.appspot.com', '127.0.0.1', 'localhost', '*']
Here is the library.yaml file I have to go with it.
# [START kubernetes_deployment]
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: library
  labels:
    app: library
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: library
    spec:
      containers:
        - name: library-app
          # Replace with your project ID or use `make template`
          image: gcr.io/library-259506/library
          # This setting makes nodes pull the docker image every time before
          # starting the pod. This is useful when debugging, but should be turned
          # off in production.
          imagePullPolicy: Always
          env:
            # [START cloudsql_secrets]
            - name: DATABASE_USER
              valueFrom:
                secretKeyRef:
                  name: cloudsql
                  key: username
            - name: DATABASE_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: cloudsql
                  key: password
            # [END cloudsql_secrets]
          ports:
            - containerPort: 8080
        # [START proxy_container]
        - image: gcr.io/cloudsql-docker/gce-proxy:1.16
          name: cloudsql-proxy
          command: ["/cloud_sql_proxy", "--dir=/cloudsql",
                    "-instances=library-259506:asia-south1:library=tcp:3306",
                    "-credential_file=/secrets/cloudsql/credentials.json"]
          volumeMounts:
            - name: cloudsql-oauth-credentials
              mountPath: /secrets/cloudsql
              readOnly: true
            - name: ssl-certs
              mountPath: /etc/ssl/certs
            - name: cloudsql
              mountPath: /cloudsql
        # [END proxy_container]
      # [START volumes]
      volumes:
        - name: cloudsql-oauth-credentials
          secret:
            secretName: cloudsql-oauth-credentials
        - name: ssl-certs
          hostPath:
            path: /etc/ssl/certs
        - name: cloudsql
          emptyDir:
      # [END volumes]
# [END kubernetes_deployment]
---
# [START service]
# The library-svc service provides a load-balancing proxy over the polls app
# pods. By specifying the type as a 'LoadBalancer', Container Engine will
# create an external HTTP load balancer.
# The service directs traffic to the deployment by matching the service's selector to the deployment's label
#
# For more information about external HTTP load balancing see:
# https://cloud.google.com/container-engine/docs/load-balancer
apiVersion: v1
kind: Service
metadata:
  name: library-svc
spec:
  type: LoadBalancer
  ports:
    - port: 80
      targetPort: 8080
  selector:
    app: library
# [END service]
The Dockerfile:
FROM python:3
ENV PYTHONUNBUFFERED 1
RUN mkdir /code
WORKDIR /code
COPY requirements.txt /code/
RUN pip install -r requirements.txt
COPY . /code/
# Server
EXPOSE 8080
STOPSIGNAL SIGINT
ENTRYPOINT ["python", "manage.py"]
CMD ["runserver", "0.0.0.0:8080"]
library-app container logs
2019-12-07T02:03:58.742639999Z Performing system checks...
I
2019-12-07T02:03:58.742701271Z
I
2019-12-07T02:03:58.816567541Z System check identified no issues (0 silenced).
I
2019-12-07T02:03:59.338790311Z December 07, 2019 - 02:03:59
I
2019-12-07T02:03:59.338986187Z Django version 2.2.6, using settings 'locallibrary.settings'
I
2019-12-07T02:03:59.338995688Z Starting development server at http://0.0.0.0:8080/
I
2019-12-07T02:03:59.338999467Z Quit the server with CONTROL-C.
I
2019-12-07T02:04:00.814238478Z [07/Dec/2019 02:04:00] "GET / HTTP/1.1" 400 26
E
undefined
library container logs:
2019-12-07T02:03:56.568839208Z Performing system checks...
I
2019-12-07T02:03:56.568912262Z
I
2019-12-07T02:03:56.624835039Z System check identified no issues (0 silenced).
I
2019-12-07T02:03:56.669088750Z Exception in thread django-main-thread:
E
2019-12-07T02:03:56.669204639Z Traceback (most recent call last):
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
self.connect()
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 195, in connect
self.connection = self.get_new_connection(conn_params)
File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
connection = Database.connect(**conn_params)
File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 126, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
psycopg2.OperationalError: could not connect to server: Connection refused
E
2019-12-07T02:03:56.672779570Z Is the server running on host "127.0.0.1" and accepting
E
2019-12-07T02:03:56.672783910Z TCP/IP connections on port 3306?
E
2019-12-07T02:03:56.672826889Z
E
2019-12-07T02:03:56.672903098Z
E
2019-12-07T02:03:56.672909494Z The above exception was the direct cause of the following exception:
E
2019-12-07T02:03:56.672913216Z
E
2019-12-07T02:03:56.672962576Z Traceback (most recent call last):
File "/usr/local/lib/python3.8/threading.py", line 932, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.8/threading.py", line 870, in run
self._target(*self._args, **self._kwargs)
File "/usr/local/lib/python3.8/site-packages/django/utils/autoreload.py", line 54, in wrapper
fn(*args, **kwargs)
File "/usr/local/lib/python3.8/site-packages/django/core/management/commands/runserver.py", line 120, in inner_run
self.check_migrations()
File "/usr/local/lib/python3.8/site-packages/django/core/management/base.py", line 453, in check_migrations
executor = MigrationExecutor(connections[DEFAULT_DB_ALIAS])
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/executor.py", line 18, in __init__
self.loader = MigrationLoader(self.connection)
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/loader.py", line 49, in __init__
self.build_graph()
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/loader.py", line 212, in build_graph
self.applied_migrations = recorder.applied_migrations()
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 73, in applied_migrations
if self.has_table():
File "/usr/local/lib/python3.8/site-packages/django/db/migrations/recorder.py", line 56, in has_table
return self.Migration._meta.db_table in self.connection.introspection.table_names(self.connection.cursor())
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 256, in cursor
return self._cursor()
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 233, in _cursor
self.ensure_connection()
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
self.connect()
File "/usr/local/lib/python3.8/site-packages/django/db/utils.py", line 89, in __exit__
raise dj_exc_value.with_traceback(traceback) from exc_value
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 217, in ensure_connection
self.connect()
File "/usr/local/lib/python3.8/site-packages/django/db/backends/base/base.py", line 195, in connect
self.connection = self.get_new_connection(conn_params)
File "/usr/local/lib/python3.8/site-packages/django/db/backends/postgresql/base.py", line 178, in get_new_connection
connection = Database.connect(**conn_params)
File "/usr/local/lib/python3.8/site-packages/psycopg2/__init__.py", line 126, in connect
conn = _connect(dsn, connection_factory=connection_factory, **kwasync)
django.db.utils.OperationalError: could not connect to server: Connection refused
E
2019-12-07T02:03:56.684080787Z Is the server running on host "127.0.0.1" and accepting
E
2019-12-07T02:03:56.684085077Z TCP/IP connections on port 3306?
E
Cloud SQL proxy container logs:
2019-12-07T02:03:57.648625670Z 2019/12/07 02:03:57 current FDs rlimit set to 1048576, wanted limit is 8500. Nothing to do here.
E
2019-12-07T02:03:57.660316236Z 2019/12/07 02:03:57 using credential file for authentication; email=library-service#library-259506.iam.gserviceaccount.com
E
2019-12-07T02:03:58.167649397Z 2019/12/07 02:03:58 Listening on 127.0.0.1:3306 for library-259506:asia-south1:library
E
2019-12-07T02:03:58.167761109Z 2019/12/07 02:03:58 Ready for new connections
E
2019-12-07T02:03:58.865747581Z 2019/12/07 02:03:58 New connection for "library-259506:asia-south1:library"
E
2019-12-07T03:03:29.000559014Z 2019/12/07 03:03:29 ephemeral certificate for instance library-259506:asia-south1:library will expire soon, refreshing now.
E
2019-12-07T04:02:59.004152307Z 2019/12/07 04:02:59 ephemeral certificate for instance library-259506:asia-south1:library will expire soon, refreshing now.
E
I can't believe it: the issue was that my library.yaml file was missing this for the Cloud SQL proxy. I believe "RunAs" controls which user ID the containers run with; this forces the container to run as the daemon (non-root) user, and it lets us specify runAsUser, which can be overridden by runAsUser in a securityContext on a per-container basis.
https://cloud.google.com/solutions/best-practices-for-operating-containers#avoid_running_as_root
# [START cloudsql_security_context]
securityContext:
  runAsUser: 2  # non-root user
  allowPrivilegeEscalation: false
# [END cloudsql_security_context]
From docs https://kubernetes.io/docs/concepts/policy/pod-security-policy/#users-and-groups: "MustRunAsNonRoot - Requires that the pod be submitted with a non-zero runAsUser or have the USER directive defined (using a numeric UID) in the image. Pods which have specified neither runAsNonRoot nor runAsUser settings will be mutated to set runAsNonRoot=true, thus requiring a defined non-zero numeric USER directive in the container. No default provided. Setting allowPrivilegeEscalation=false is strongly recommended with this strategy."
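For context, here is roughly where that block would sit in library.yaml. Applying it to the cloudsql-proxy container is my assumption, based on the cloudsql_security_context marker in the snippet above:

# Sketch only: the securityContext placed on the Cloud SQL proxy container.
      containers:
        ...
        - image: gcr.io/cloudsql-docker/gce-proxy:1.16
          name: cloudsql-proxy
          # [START cloudsql_security_context]
          securityContext:
            runAsUser: 2                     # non-root user
            allowPrivilegeEscalation: false
          # [END cloudsql_security_context]
          command: ["/cloud_sql_proxy", "--dir=/cloudsql",
                    "-instances=library-259506:asia-south1:library=tcp:3306",
                    "-credential_file=/secrets/cloudsql/credentials.json"]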

Unable to run ansible playbook command - potential Authentication error (msrest)

I have recently started using Ansible to automate the deployment of a docker image to an Azure Kubernetes Service.
I have an Ansible file called azure_create_aks.yml. I am running the following command on my Mac: ansible-playbook azure_create_aks.yml, but it fails with the following (snippet from the stack trace):
msrest.exceptions.AuthenticationError: , AdalError: Get Token request returned http error: 400 and server response: Bad Request
I've tried uninstalling ansible and azure-cli and reinstalling them using the following:
- brew update && brew install azure-cli
- az aks install-cli
- pip3 install ansible[azure]
I also tried uninstalling Python 3 so that it would use Python 2 instead. From looking around on Stack Overflow, I think I might be encountering a dependency issue with msrestazure, or possibly an issue with the version of pip or Python I have locally.
After running ansible-playbook azure_create_aks.yml, I get the following:
[WARNING]: No inventory was parsed, only implicit localhost is available
[WARNING]: provided hosts list is empty, only localhost is available. Note that the implicit localhost does not match 'all'
PLAY [Create Azure Kubernetes Service] *********************************************************************************************************************************************************************
TASK [Gathering Facts] *************************************************************************************************************************************************************************************
ok: [localhost]
TASK [Create resource group] *******************************************************************************************************************************************************************************
fatal: [localhost]: FAILED! => {"changed": false, "module_stderr": "/Users/hughej/.ansible/tmp/ansible-tmp-1569328685.354382-6386128387997/AnsiballZ_azure_rm_resourcegroup.py:18: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses\n import imp\nTraceback (most recent call last):\n File \"/usr/local/lib/python3.7/site-packages/msrestazure/azure_active_directory.py\", line 366, in set_token\n self.secret\n File \"/usr/local/lib/python3.7/site-packages/adal/authentication_context.py\", line 179, in acquire_token_with_client_credentials\n return self._acquire_token(token_func)\n File \"/usr/local/lib/python3.7/site-packages/adal/authentication_context.py\", line 128, in _acquire_token\n return token_func(self)\n File \"/usr/local/lib/python3.7/site-packages/adal/authentication_context.py\", line 177, in token_func\n return token_request.get_token_with_client_credentials(client_secret)\n File \"/usr/local/lib/python3.7/site-packages/adal/token_request.py\", line 310, in get_token_with_client_credentials\n token = self._oauth_get_token(oauth_parameters)\n File \"/usr/local/lib/python3.7/site-packages/adal/token_request.py\", line 112, in _oauth_get_token\n return client.get_token(oauth_parameters)\n File \"/usr/local/lib/python3.7/site-packages/adal/oauth2_client.py\", line 289, in get_token\n raise AdalError(return_error_string, error_response)\nadal.adal_error.AdalError: Get Token request returned http error: 400 and server response: Bad Request\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/Users/hughej/.ansible/tmp/ansible-tmp-1569328685.354382-6386128387997/AnsiballZ_azure_rm_resourcegroup.py\", line 114, in <module>\n _ansiballz_main()\n File \"/Users/hughej/.ansible/tmp/ansible-tmp-1569328685.354382-6386128387997/AnsiballZ_azure_rm_resourcegroup.py\", line 106, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/Users/hughej/.ansible/tmp/ansible-tmp-1569328685.354382-6386128387997/AnsiballZ_azure_rm_resourcegroup.py\", line 49, in invoke_module\n imp.load_module('__main__', mod, module, MOD_DESC)\n File \"/usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/imp.py\", line 234, in load_module\n return load_source(name, filename, file)\n File \"/usr/local/Cellar/python/3.7.4_1/Frameworks/Python.framework/Versions/3.7/lib/python3.7/imp.py\", line 169, in load_source\n module = _exec(spec, sys.modules[name])\n File \"<frozen importlib._bootstrap>\", line 630, in _exec\n File \"<frozen importlib._bootstrap_external>\", line 728, in exec_module\n File \"<frozen importlib._bootstrap>\", line 219, in _call_with_frames_removed\n File \"/var/folders/2t/30gk2pfx5_n08tfd45g3v674b8c1y8/T/ansible_azure_rm_resourcegroup_payload_6hqj1_fs/__main__.py\", line 266, in <module>\n File \"/var/folders/2t/30gk2pfx5_n08tfd45g3v674b8c1y8/T/ansible_azure_rm_resourcegroup_payload_6hqj1_fs/__main__.py\", line 262, in main\n File \"/var/folders/2t/30gk2pfx5_n08tfd45g3v674b8c1y8/T/ansible_azure_rm_resourcegroup_payload_6hqj1_fs/__main__.py\", line 144, in __init__\n File \"/var/folders/2t/30gk2pfx5_n08tfd45g3v674b8c1y8/T/ansible_azure_rm_resourcegroup_payload_6hqj1_fs/ansible_azure_rm_resourcegroup_payload.zip/ansible/module_utils/azure_rm_common.py\", line 318, in __init__\n File 
\"/var/folders/2t/30gk2pfx5_n08tfd45g3v674b8c1y8/T/ansible_azure_rm_resourcegroup_payload_6hqj1_fs/ansible_azure_rm_resourcegroup_payload.zip/ansible/module_utils/azure_rm_common.py\", line 1095, in __init__\n File \"/usr/local/lib/python3.7/site-packages/msrestazure/azure_active_directory.py\", line 354, in __init__\n self.set_token()\n File \"/usr/local/lib/python3.7/site-packages/msrestazure/azure_active_directory.py\", line 370, in set_token\n raise_with_traceback(AuthenticationError, \"\", err)\n File \"/usr/local/lib/python3.7/site-packages/msrest/exceptions.py\", line 54, in raise_with_traceback\n raise error.with_traceback(exc_traceback)\n File \"/usr/local/lib/python3.7/site-packages/msrestazure/azure_active_directory.py\", line 366, in set_token\n self.secret\n File \"/usr/local/lib/python3.7/site-packages/adal/authentication_context.py\", line 179, in acquire_token_with_client_credentials\n return self._acquire_token(token_func)\n File \"/usr/local/lib/python3.7/site-packages/adal/authentication_context.py\", line 128, in _acquire_token\n return token_func(self)\n File \"/usr/local/lib/python3.7/site-packages/adal/authentication_context.py\", line 177, in token_func\n return token_request.get_token_with_client_credentials(client_secret)\n File \"/usr/local/lib/python3.7/site-packages/adal/token_request.py\", line 310, in get_token_with_client_credentials\n token = self._oauth_get_token(oauth_parameters)\n File \"/usr/local/lib/python3.7/site-packages/adal/token_request.py\", line 112, in _oauth_get_token\n return client.get_token(oauth_parameters)\n File \"/usr/local/lib/python3.7/site-packages/adal/oauth2_client.py\", line 289, in get_token\n raise AdalError(return_error_string, error_response)\nmsrest.exceptions.AuthenticationError: , AdalError: Get Token request returned http error: 400 and server response: Bad Request\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
I expected the ansible-playbook command to run and deploy to Azure. However, this authentication error is stopping the process.
By the way, here's my Ansible playbook file (sanitised):
- name: Create Azure Kubernetes Service
  hosts: localhost
  connection: local
  vars:
    resource_group: pipeline-in-a-box
    location: uksouth
    aks_name: pipeline-in-a-box-cluster
    username: "devOpsBot"
    ssh_key: "My public SSH key"
    client_id: "service principle id"
    client_secret: "service principle password"
    kubernetes_version: "1.14.6"
  tasks:
    - name: Create resource group
      azure_rm_resourcegroup:
        name: "{{ resource_group }}"
        location: "{{ location }}"
    - name: Create a managed Azure Container Services (AKS) cluster
      azure_rm_aks:
        name: "{{ aks_name }}"
        location: "{{ location }}"
        resource_group: "{{ resource_group }}"
        dns_prefix: "{{ aks_name }}"
        kubernetes_version: "{{ kubernetes_version }}"
        linux_profile:
          admin_username: "{{ username }}"
          ssh_key: "{{ ssh_key }}"
        service_principal:
          client_id: "{{ client_id }}"
          client_secret: "{{ client_secret }}"
        agent_pool_profiles:
          - name: default
            count: 2
            vm_size: Standard_D2_v2
        tags:
          Environment: Test
    - name: Create Azure Storage Account
      azure_rm_storageaccount:
        resource_group: "{{ resource_group }}"
        name: piabstorage
        type: Standard_RAGRS
        tags:
          testing: testing
          delete: on-exit
    - name: Create managed disk
      azure_rm_manageddisk:
        name: piabdisk
        location: uksouth
        resource_group: "{{ resource_group }}"
        disk_size_gb: 1
    - name: Create an azure container registry
      azure_rm_containerregistry:
        name: piabregistry
        location: "{{ location }}"
        resource_group: "{{ resource_group }}"
        admin_user_enabled: True
        sku: Basic
      register: acr_result
    - name: Push docker image to container registry
      docker_image:
        name: atlassian/confluence-server
        repository: piabregistry.azurecr.io
        push: yes
        source: pull
    - name: Create Azure Container Instance
      azure_rm_containerinstance:
        resource_group: "{{ resource_group }}"
        name: piabcontainer
        ip_address: public
        ports:
          - "8090"
          - "8091"
        registry_login_server: piabregistry.azurecr.io
        registry_username: piabregistry
        registry_password: "{{ acr_result.credentials.password }}"
        containers:
          - name: confluence-server
            ports:
              - "8090"
              - "8091"
            image: atlassian/confluence-server
    - name: Get details of the AKS
      azure_rm_aks_facts:
        name: aksfacts
        resource_group: "{{ resource_group }}"
        show_kubeconfig: user
    - name: Show AKS cluster detail
      debug:
        var: output.aks[0]

Error while executing playbook to start ec2 instance

I have a playbook that starts an ec2 instance. But I keep getting the following error when I run it.
---
- name: Create an ec2 instance
  hosts: localhost
  gather_facts: false
  vars:
    region: us-east-1
    instance_type: t2.micro
    ami: ami-01ac7d9c1179d7b74
    keypair: priyajdm
  tasks:
    - name: Create an ec2 instance
      ec2:
        key_name: "{{ keypair }}"
        group: launch-wizard-31
        instance_type: "{{ instance_type }}"
        image: "{{ ami }}"
        wait: true
        region: "{{ region }}"
        count: 1
        vpc_subnet_id: subnet-02f498e16fd56c277
        assign_public_ip: yes
      register: ec2
Error:
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: AuthFailureAWS was not able to validate the provided access credentialscb70bd1a-b7ec-41aa-895a-fabf9e0b6cfe
fatal: [localhost]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n File \"/root/.ansible/tmp/ansible-tmp-1553599571.08-120541135236553/AnsiballZ_ec2.py\", line 113, in \n _ansiballz_main()\n File \"/root/.ansible/tmp/ansible-tmp-1553599571.08-120541135236553/AnsiballZ_ec2.py\", line 105, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/root/.ansible/tmp/ansible-tmp-1553599571.08-120541135236553/AnsiballZ_ec2.py\", line 48, in invoke_module\n imp.load_module('main', mod, module, MOD_DESC)\n File \"/tmp/ansible_ec2_payload_hXpgWw/main.py\", line 1702, in \n File \"/tmp/ansible_ec2_payload_hXpgWw/main.py\", line 1686, in main\n File \"/tmp/ansible_ec2_payload_hXpgWw/main.py\", line 989, in create_instances\n File \"/home/ubuntu/.local/lib/python2.7/site-packages/boto/vpc/init.py\", line 1152, in get_all_subnets\n return self.get_list('DescribeSubnets', params, [('item', Subnet)])\n File \"/home/ubuntu/.local/lib/python2.7/site-packages/boto/connection.py\", line 1186, in get_list\n raise self.ResponseError(response.status, response.reason, body)\nboto.exception.EC2ResponseError: EC2ResponseError: 401 Unauthorized\n\nAuthFailureAWS was not able to validate the provided access credentialscb70bd1a-b7ec-41aa-895a-fabf9e0b6cfe\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
Where do you keep your AWS credentials? You have 3 options:
Use the boto configuration file
Set the needed environment variables
Set the EC2 module's AWS credential parameters in your task
The last option is the worst practice, but it is the easiest way to rule out authentication problems (see the sketch below).
Check the EC2 module's notes.
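For illustration, here is a minimal sketch of the third option applied to the task from the question, passing the credentials straight to the ec2 module. The key values are placeholders; prefer environment variables or the boto config file, or at least vault-encrypt these values:

    - name: Create an ec2 instance
      ec2:
        aws_access_key: "AKIAXXXXXXXXXXXXXXXX"      # placeholder access key
        aws_secret_key: "xxxxxxxxxxxxxxxxxxxxxxxx"  # placeholder secret key
        key_name: "{{ keypair }}"
        group: launch-wizard-31
        instance_type: "{{ instance_type }}"
        image: "{{ ami }}"
        wait: true
        region: "{{ region }}"
        count: 1
        vpc_subnet_id: subnet-02f498e16fd56c277
        assign_public_ip: yes
      register: ec2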