Cloud Foundry Loggregator failed to establish a connection - cloud-foundry

Good Morning,
I have a problem with the Cloud Foundry doppler. Running cf logs app --recent gives the following error:
cf logs app FAILED Error dialing trafficcontroller server: read tcp
10.0.0.6:45719->139.25.25.233:4443: i/o timeout. Please ask your Cloud Foundry Operator to check the platform configuration
(trafficcontroller is wss://doppler.de.cloudlab.com:4443).
The same problem occurs with cf push.
We are using CF 239 and CF CLI 6.22.1.
The doppler config is:
- name: doppler
  instances: 1
  vm_type: medium
  azs: [INDIA]
  stemcell: ubuntu-trusty
  templates:
  - {name: doppler, release: cf}
  - {name: metron_agent, release: cf}
  - {name: syslog_drain_binder, release: cf}
  networks:
  - name: private
  properties:
    doppler_endpoint:
      shared_secret: password
- name: loggregator_trafficcontroller
  instances: 1
  vm_type: medium
  azs: [INDIA]
  stemcell: ubuntu-trusty
  templates:
  - {name: loggregator_trafficcontroller, release: cf}
  - {name: metron_agent, release: cf}
  - {name: route_registrar, release: cf}
  networks:
  - name: private
  properties:
    route_registrar:
      routes:
      - name: doppler
        registration_interval: 20s
        port: 8081
        uris:
        - "doppler.<%= system_domain %>"
      - name: loggregator
        registration_interval: 20s
        port: 8080
        uris:
        - "loggregator.<%= system_domain %>"
nc is able to establish a connection to the router.
nc -vz 139.25.25.233 4443
Connection to 139.25.25.233 4443 port [tcp/*] succeeded!
Any ideas?
Update:
Some more information:
$ cf curl /v2/info
{
  "name": "",
  "build": "",
  "support": "",
  "version": 0,
  "description": "CloudFoundry IN",
  "authorization_endpoint": "https://login.de.cloudlab.com",
  "token_endpoint": "https://uaa.de.cloudlab.com",
  "min_cli_version": null,
  "min_recommended_cli_version": null,
  "api_version": "2.57.0",
  "app_ssh_endpoint": "ssh.de.cloudlab.com:2222",
  "app_ssh_host_key_fingerprint": null,
  "app_ssh_oauth_client": "ssh-proxy",
  "logging_endpoint": "wss://loggregator.de.cloudlab.com:4443",
  "doppler_logging_endpoint": "wss://doppler.de.cloudlab.com:443"
}

The Cloud Foundry CLI has been upgraded to 6.22.2; because of that, you might face issues with the existing CF CLI installed on your box for pushing apps. Please try installing the latest CLI from the location below and verify:
https://github.com/cloudfoundry/cli/releases
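After installing, a quick check that the new binary is the one being picked up:
cf --version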

It looks like your Cloud Foundry operator changed the Doppler endpoint port from 4443 to 443.
The cf CLI caches the Doppler endpoint URL locally and is still pointing at the old port.
Use cf api or cf login -a to set your endpoint again; that will refresh the locally cached data.
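For example (the API URL below is illustrative; use your platform's actual endpoint):
cf api https://api.de.cloudlab.com
cf login -a https://api.de.cloudlab.com
Either command re-fetches /v2/info and should overwrite the cached Doppler endpoint.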

Finally, I found it. It was a problem with the connection to the DNS server: the DNS was not answering. After fixing that, it works.
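For anyone hitting the same symptom (a direct nc to the IP succeeds while the CLI dial times out), a quick check of name resolution is worth doing, for example:
dig doppler.de.cloudlab.com
nslookup doppler.de.cloudlab.com
If the lookup hangs or times out, the CLI can stall resolving the trafficcontroller hostname even though the route itself is reachable.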

Related

Data Prepper Pipelines + OpenSearch Trace Analytics

I'm using the latest version of AWS OpenSearch, but somehow, when I go to the Trace Analytics dashboard, it does not show the traces sent by Data Prepper.
Manually OpenTelemetry-instrumented application
Data Prepper is running in a Docker (opensearchproject/data-prepper:latest)
OpenSearch is running on the latest version
Sample Configuration
data-prepper-config.yaml
ssl: false
pipelines.yaml
entry-pipeline:
  delay: "100"
  source:
    otel_trace_source:
      ssl: false
  sink:
    - pipeline:
        name: "raw-pipeline"
    - pipeline:
        name: "service-map-pipeline"
raw-pipeline:
  delay: "100"
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - otel_trace_raw:
  sink:
    - opensearch:
        hosts: [ "https://opensearch-domain" ]
        username: "admin"
        password: "admin"
        index_type: trace-analytics-raw
service-map-pipeline:
  delay: "100"
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - service_map_stateful:
  sink:
    - opensearch:
        hosts: ["https://opensearch-domain"]
        username: "admin"
        password: "admin"
        index_type: trace-analytics-service-map
remote-collector.yaml
...
exporters:
  otlp/data-prepper:
    endpoint: data-prepper-address:21890
service:
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [otlp/data-prepper]
When I go to the Query Workbench and run the query SELECT * FROM otel-v1-apm-span, I get the list of received trace spans. But I'm unable to see a chart or anything on the Trace Analytics dashboard (both Traces and Services); it's just an empty dashboard.
I'm also getting a warning:
WARN org.opensearch.dataprepper.plugins.processor.oteltrace.OTelTraceRawProcessor - Missing trace group for SpanId: xxxxxxxxxxxx
The traceGroupFields are also empty.
"traceGroupFields": {
"endTime": null,
"durationInNanos": null,
"statusCode": null
}
Is there something wrong with my setup? Any help is appreciated.
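One thing that may be worth checking (a hedged sketch, not a confirmed fix): Data Prepper also ships an otel_trace_group processor that backfills missing traceGroupFields by looking up each span's root span in OpenSearch. Adding it to the raw pipeline after otel_trace_raw, reusing the same connection settings as the sink, would look roughly like this:
raw-pipeline:
  delay: "100"
  source:
    pipeline:
      name: "entry-pipeline"
  processor:
    - otel_trace_raw:
    - otel_trace_group:
        hosts: [ "https://opensearch-domain" ]
        username: "admin"
        password: "admin"
  sink:
    - opensearch:
        hosts: [ "https://opensearch-domain" ]
        username: "admin"
        password: "admin"
        index_type: trace-analytics-raw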

Redis deployed in AWS - Connection time out from localhost SpringBoot app

Small question regarding Redis deployed in AWS (not AWS ElastiCache) and an issue connecting to it.
Here is the setup of the Redis deployed in AWS: (pasting only the Kubernetes StatefulSet and Service)
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: redis
spec:
  serviceName: redis
  replicas: 3
  selector:
    matchLabels:
      app: redis
  template:
    metadata:
      labels:
        app: redis
    spec:
      initContainers:
      - name: config
        image: redis:7.0.5-alpine
        command: [ "sh", "-c" ]
        args:
          - |
            cp /tmp/redis/redis.conf /etc/redis/redis.conf
            echo "finding master..."
            MASTER_FDQN=`hostname -f | sed -e 's/redis-[0-9]\./redis-0./'`
            if [ "$(redis-cli -h sentinel -p 5000 ping)" != "PONG" ]; then
              echo "master not found, defaulting to redis-0"
              if [ "$(hostname)" = "redis-0" ]; then
                echo "this is redis-0, not updating config..."
              else
                echo "updating redis.conf..."
                echo "slaveof $MASTER_FDQN 6379" >> /etc/redis/redis.conf
              fi
            else
              echo "sentinel found, finding master"
              MASTER="$(redis-cli -h sentinel -p 5000 sentinel get-master-addr-by-name mymaster | grep -E '(^redis-\d{1,})|([0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3})')"
              echo "master found : $MASTER, updating redis.conf"
              echo "slaveof $MASTER 6379" >> /etc/redis/redis.conf
            fi
        volumeMounts:
        - name: redis-config
          mountPath: /etc/redis/
        - name: config
          mountPath: /tmp/redis/
      containers:
      - name: redis
        image: redis:7.0.5-alpine
        command: ["redis-server"]
        args: ["/etc/redis/redis.conf"]
        ports:
        - containerPort: 6379
          name: redis
        volumeMounts:
        - name: data
          mountPath: /data
        - name: redis-config
          mountPath: /etc/redis/
      volumes:
      - name: redis-config
        emptyDir: {}
      - name: config
        configMap:
          name: redis-config
  volumeClaimTemplates:
  - metadata:
      name: data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: nfs-1
      resources:
        requests:
          storage: 50Mi
---
apiVersion: v1
kind: Service
metadata:
  name: redis
spec:
  ports:
  - port: 6379
    targetPort: 6379
    name: redis
  selector:
    app: redis
  type: LoadBalancer
The pods are healthy; I can exec into them and perform operations fine. Here is the kubectl get all output:
NAME READY STATUS RESTARTS AGE
pod/redis-0 1/1 Running 0 22h
pod/redis-1 1/1 Running 0 22h
pod/redis-2 1/1 Running 0 22h
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/redis LoadBalancer 192.168.45.55 10.51.5.2 6379:30315/TCP 26h
NAME READY AGE
statefulset.apps/redis 3/3 22h
Here is the describe of the service:
Name: redis
Namespace: Namespace
Labels: <none>
Annotations: <none>
Selector: app=redis
Type: LoadBalancer
IP Family Policy: SingleStack
IP Families: IPv4
IP: 192.168.22.33
IPs: 192.168.22.33
LoadBalancer Ingress: 10.51.5.2
Port: redis 6379/TCP
TargetPort: 6379/TCP
NodePort: redis 30315/TCP
Endpoints: 192.xxx:6379,192.xxx:6379,192.xxx:6379
Session Affinity: None
External Traffic Policy: Cluster
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal IPAllocated 68s metallb-controller Assigned IP ["10.51.5.2"]
Normal nodeAssigned 58s (x5 over 66s) metallb-speaker announcing from node "someaddress.com" with protocol "bgp"
Normal nodeAssigned 58s (x5 over 66s) metallb-speaker announcing from node "someaddress.com" with protocol "bgp"
I then try to connect to it, i.e. to insert some data, with a very straightforward Spring Boot application. The application has no business logic; it just tries to insert data.
Here are the relevant parts:
@Configuration
public class RedisConfiguration {

    @Bean
    public ReactiveRedisConnectionFactory reactiveRedisConnectionFactory() {
        return new LettuceConnectionFactory("10.51.5.2", 30315);
    }
}

@Repository
public class RedisRepository {

    private final ReactiveRedisOperations<String, String> reactiveRedisOperations;

    public RedisRepository(ReactiveRedisOperations<String, String> reactiveRedisOperations) {
        this.reactiveRedisOperations = reactiveRedisOperations;
    }

    public Mono<RedisPojo> save(RedisPojo redisPojo) {
        return reactiveRedisOperations.opsForValue().set(redisPojo.getInput(), redisPojo.getOutput()).map(__ -> redisPojo);
    }
}
Each time I try to write data, I get this exception:
2022-12-02T20:20:08.015+08:00 ERROR 1184 --- [ctor-http-nio-3] a.w.r.e.AbstractErrorWebExceptionHandler : [8f16a752-1] 500 Server Error for HTTP POST "/save"
org.springframework.data.redis.RedisConnectionFailureException: Unable to connect to Redis
at org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory$ExceptionTranslatingConnectionProvider.translateException(LettuceConnectionFactory.java:1602) ~[spring-data-redis-3.0.0.jar:3.0.0]
Suppressed: reactor.core.publisher.FluxOnAssembly$OnAssemblyException:
Error has been observed at the following site(s):
*__checkpoint ⇢ Handler com.redis.controller.RedisController#test(RedisRequest) [DispatcherHandler]
*__checkpoint ⇢ HTTP POST "/save" [ExceptionHandlingWebHandler]
Original Stack Trace:
at org.springframework.data.redis.connection.lettuce.LettuceConnectionFactory$ExceptionTranslatingConnectionProvider.translateException(LettuceConnectionFactory.java:1602) ~[spring-data-redis-3.0.0.jar:3.0.0]
Caused by: io.lettuce.core.RedisConnectionException: Unable to connect to 10.51.5.2/<unresolved>:30315
at io.lettuce.core.RedisConnectionException.create(RedisConnectionException.java:78) ~[lettuce-core-6.2.1.RELEASE.jar:6.2.1.RELEASE]
at io.lettuce.core.RedisConnectionException.create(RedisConnectionException.java:56) ~[lettuce-core-6.2.1.RELEASE.jar:6.2.1.RELEASE]
at io.lettuce.core.AbstractRedisClient.getConnection(AbstractRedisClient.java:350) ~[lettuce-core-6.2.1.RELEASE.jar:6.2.1.RELEASE]
at io.lettuce.core.RedisClient.connect(RedisClient.java:216) ~[lettuce-core-6.2.1.RELEASE.jar:6.2.1.RELEASE]
Caused by: io.netty.channel.ConnectTimeoutException: connection timed out: /10.51.5.2:30315
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe$1.run(AbstractNioChannel.java:261) ~[netty-transport-4.1.85.Final.jar:4.1.85.Final]
at io.netty.util.concurrent.PromiseTask.runTask(PromiseTask.java:98) ~[netty-common-4.1.85.Final.jar:4.1.85.Final]
This is particularly puzzling, because I am quite sure the code of the Spring Boot app is working. When I change the IP in return new LettuceConnectionFactory("10.51.5.2", 30315); to
a regular Redis on my laptop ("localhost", 6379),
a dockerized Redis on my laptop,
a dockerized Redis on prem,
all of them work fine.
Therefore, I am quite puzzled about what I did wrong with the setup of this Redis in AWS.
What should I do in order to connect to it properly?
May I get some help please?
Thank you
By default, Redis binds itself to the IP addresses 127.0.0.1 and ::1 and does not accept connections on non-local interfaces. Chances are high that this is your main issue, and you may want to review your redis.conf file to bind Redis to the interface you need, or to the generic * -::*, as explained in the comments of the config file itself.
With that being said, Redis also does not accept connections on non-local interfaces if the default user has no password - a security layer named protected mode. Thus you should either give your default user a password or disable protected mode in your redis.conf file.
Not sure if this applies to your case but, as a side note, I would suggest always avoiding exposing Redis to the Internet.
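A minimal redis.conf sketch of the directives mentioned above (values are illustrative; pick the option that fits your security posture):
# listen on all interfaces instead of only loopback
bind * -::*
# either require a password for the default user ...
requirepass change-me-to-a-strong-password
# ... or, alternatively, turn protected mode off (risky on exposed networks)
protected-mode no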
You are mixing two things.
To reach this service from pods in different namespaces you do not need an external load balancer; you can just use the redis.namespace-name:6379 DNS name and it will just work. Such a DNS name exists for every service you create (but it only works inside Kubernetes).
Kubernetes will make sure that your traffic is routed to the proper pods (assuming there is more than one).
If you want to expose Redis from outside of Kubernetes, then you need to make sure there is connectivity from the outside, and then you need a network load balancer that forwards traffic to your Kubernetes service (in your case the node port, so you need an NLB with the EKS worker nodes:30315 as the targets).
If your worker nodes have public IPs and their SecurityGroups allow connecting to them directly, you could try connecting to a worker node's IP just to test things out (without the LB).
And regardless of your setup, you can always create a proxy via kubectl:
kubectl port-forward -n redisNS svc/redis 6379:6379
and connect from the Spring Boot app to localhost:6379.
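For example, while the port-forward is running, the connection factory from the question becomes (a minimal sketch):
return new LettuceConnectionFactory("localhost", 6379);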
How do you want to connect from the app to Redis in your final setup?

Promtail End Point Failure At /loki/api/v1/push On EC2 Instance Via Docker

I am using an AWS instance and I am trying to run Promtail in order to fetch logs and forward them to the Loki server. Promtail, Loki and Grafana are being run through Docker.
The Loki server is running on port 3100, Promtail on 3400 and Grafana on 8001. Since it is an AWS platform, what needs to be done so that it stops throwing an error at the http://43.206.43.87:3100/loki/api/v1/push endpoint?
Here is my promtail-config.yaml:
server:
  http_listen_port: 3400
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://43.206.43.87:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - 43.206.43.87
        labels:
          job: varlogs
          __path__: /var/log/*log
Here is my loki-config.yaml
auth_enabled: false
server:
  http_listen_port: 3100
  grpc_listen_port: 0
common:
  path_prefix: /tmp/loki
  storage:
    filesystem:
      chunks_directory: /tmp/loki/chunks
      rules_directory: /tmp/loki/rules
  replication_factor: 1
  ring:
    instance_addr: 43.206.43.87
    kvstore:
      store: inmemory
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
ruler:
  alertmanager_url: http://localhost:9093
Please help me out
All I had to do was change the loki-config.yaml file: I had to write
instance_addr: localhost
and everything worked
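To confirm Loki is reachable before pointing Promtail at it, a quick check (the /ready endpoint is available in recent Loki versions):
curl -s http://43.206.43.87:3100/ready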

client disconnected before any response GCLB

We deployed our site behind GCLB.
LB -> Cloud Run -> App Engine API
Cloud Run is hosting a React site and App Engine a Golang API.
After 12 hours we started to see a decline in the number of clicks via Google Analytics, but traffic was pretty much the same.
Our assumption is that we "lost" traffic somehow; I can see two main issues in the logs:
404s with addresses of old site components.
"client disconnected before any response" errors.
I can understand the 404 errors: they are cached requests looking for old site components.
But I don't understand the client disconnected error, and whether it's related to our "lost" traffic.
Any suggestions for how to analyze our "lost" traffic?
UPDATE:
I found some correlation to the client disconnected error.
The requestUrl contains image resources, for example:
images/zoom.png?v1.0
The backend service name is empty (backend_service_name: ""); not sure how it can be empty, since I mapped all the resources and hosts.
LOG
{
  "insertId": "cs2fmdg2eo8nba",
  "jsonPayload": {
    "cacheId": "FRA-1209ea83",
    "@type": "type.googleapis.com/google.cloud.loadbalancing.type.LoadBalancerLogEntry",
    "statusDetails": "client_disconnected_before_any_response"
  },
  "httpRequest": {
    "requestMethod": "GET",
    "requestUrl": "https://travelpricedrops.com/images/aero.png?v1.0",
    "requestSize": "78",
    "userAgent": "Mozilla/5.0 (iPhone; CPU iPhone OS 14_8 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1",
    "remoteIp": "109.104.52.1",
    "referer": "https://travelpricedrops.com/passthru?tab=front&vert=flights&origin-iata=LEJ&destination-iata=JFK&departure-time=2021-12-26T11%3A00%3A00Z&cabin-class=economy&num-adults=1&num-youth=0&rental-duration=6&dta=48&return-time=2022-01-01T11%3A00%3A00Z&f=cf&fuid=1102&b=k&buid=1043",
    "cacheLookup": true,
    "latency": "0.071958s"
  },
  "resource": {
    "type": "http_load_balancer",
    "labels": {
      "zone": "global",
      "backend_service_name": "",
      "forwarding_rule_name": "tpd-int-https-ipv4",
      "target_proxy_name": "int-tpd-target-proxy-2",
      "url_map_name": "int-tpd",
      "project_id": "tpdrops"
    }
  },
  "timestamp": "2021-11-09T06:13:55.121455Z",
  "severity": "INFO",
  "logName": "projects/tpdrops/logs/requests",
  "trace": "projects/tpdrops/traces/13821ba38ae9e3191381f3f64b0a7b1a",
  "receiveTimestamp": "2021-11-09T06:13:55.343086132Z",
  "spanId": "a5ae86336a24bc32"
}
Config
**gcloud compute forwarding-rules describe tpd-int-https-ipv4**
IPAddress: 34.149.93.11
IPProtocol: TCP
creationTimestamp: '2021-08-30T11:49:06.047-07:00'
description: ''
fingerprint: CIAg3TcEb9Y=
id: '1815919129513727693'
kind: compute#forwardingRule
labelFingerprint: 42WmSpB8rSM=
loadBalancingScheme: EXTERNAL
name: tpd-int-https-ipv4
networkTier: PREMIUM
portRange: 443-443
selfLink: https://www.googleapis.com/compute/v1/projects/tpdrops/global/forwardingRules/tpd-int-https-ipv4
target: https://www.googleapis.com/compute/v1/projects/tpdrops/global/targetHttpsProxies/int-tpd-target-proxy-2
**gcloud compute backend-services describe tpd-prod-back**
affinityCookieTtlSec: 0
backends:
- balancingMode: UTILIZATION
  capacityScaler: 0.0
  group: https://www.googleapis.com/compute/v1/projects/tpdrops/regions/us-central1/networkEndpointGroups/tpd-front
cdnPolicy:
  cacheKeyPolicy:
    includeHost: true
    includeProtocol: true
    includeQueryString: true
  cacheMode: CACHE_ALL_STATIC
  clientTtl: 3600
  defaultTtl: 3600
  maxTtl: 86400
  negativeCaching: false
  requestCoalescing: true
  serveWhileStale: 86400
  signedUrlCacheMaxAgeSec: '0'
connectionDraining:
  drainingTimeoutSec: 0
creationTimestamp: '2021-10-25T04:09:29.908-07:00'
description: ''
enableCDN: true
fingerprint: 5FNZk6GXJTw=
iap:
  enabled: false
id: '6357784085114072710'
kind: compute#backendService
loadBalancingScheme: EXTERNAL
logConfig:
  enable: true
  sampleRate: 1.0
name: tpd-prod-back
port: 80
portName: http
protocol: HTTP
selfLink: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-prod-back
sessionAffinity: NONE
timeoutSec: 30
**gcloud compute url-maps describe int-tpd**
creationTimestamp: '2021-08-29T06:08:35.918-07:00'
defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-prod-back
fingerprint: trtG9xBMlvE=
hostRules:
- hosts:
  - acpt.travelpricedrops.com
  pathMatcher: path-matcher-2
- hosts:
  - int.travelpricedrops.com
  pathMatcher: path-matcher-1
- hosts:
  - api.acpt.travelpricedrops.com
  pathMatcher: path-matcher-3
- hosts:
  - api.int.travelpricedrops.com
  pathMatcher: path-matcher-4
- hosts:
  - api.travelpricedrops.com
  pathMatcher: path-matcher-5
- hosts:
  - travelpricedrops.com
  pathMatcher: path-matcher-6
id: '6018005644614187068'
kind: compute#urlMap
name: int-tpd
pathMatchers:
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-acpt-back
  name: path-matcher-2
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-int-http
  name: path-matcher-1
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-api-acpt
  name: path-matcher-3
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-api-int
  name: path-matcher-4
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-api
  name: path-matcher-5
- defaultService: https://www.googleapis.com/compute/v1/projects/tpdrops/global/backendServices/tpd-prod-back
  name: path-matcher-6
selfLink: https://www.googleapis.com/compute/v1/projects/tpdrops/global/urlMaps/int-tpd
**gcloud compute target-http-proxies describe int-tpd-target-proxy-2**
ERROR: (gcloud.compute.target-http-proxies.describe) Could not fetch resource:
- The resource 'projects/tpdrops/global/targetHttpProxies/int-tpd-target-proxy-2' was not found
Your load balancer's configuration looks OK; you have an SSL-secured HTTPS frontend on port 443 pointing to an HTTP backend on port 80, which means that SSL is terminated at the load balancer and traffic is sent as plain HTTP to your backend.
The error you're getting means (as per the documentation) that the client disconnected before the load balancer could reply:
client_disconnected_before_any_response - The connection to the client was broken before the load balancer sent any response.
Now to answer your questions.
Since the images are served directly by your app (I didn't see any host-path rules saying otherwise), make sure the application can serve images in time. Set your application response timeout to 10 seconds or more and this should solve the issue. Have a look at this discussion, which may be quite useful for you.
1.1 - there's also a configurable request timeout for Cloud Run services - you can check it by running gcloud run services describe SERVICE_NAME
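For example (SERVICE_NAME and REGION are placeholders):
gcloud run services describe SERVICE_NAME --region=REGION
gcloud run services update SERVICE_NAME --region=REGION --timeout=30
The second command raises the Cloud Run request timeout, here to 30 seconds.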
The backend_service_name: "" string you mentioned may be empty - nothing to worry about - this is expected behavior.
Additionally, have a look at the backend service timeout in Timeouts and retries in external load balancing, which may also shed some light on your case.
Lastly - have a look at How to debug failed requests with client_disconnected_before_any_response.

GCP Deployment Manager "ResourceErrorCode":"400" while database user creation

I am experimenting with Deployment Manager, and each time I try to deploy a SQL instance with a DB on it and two users, some of the tasks fail. Most of the time it is the users:
conf.yaml:
resources:
- name: mycloudsql
  type: gcp-types/sqladmin-v1beta4:instances
  properties:
    name: mycloudsql-01
    backendType: SECOND_GEN
    instanceType: CLOUD_SQL_INSTANCE
    databaseVersion: MYSQL_5_7
    region: europe-west6
    settings:
      tier: db-f1-micro
      locationPreference:
        zone: europe-west6-a
      activationPolicy: ALWAYS
      dataDiskSizeGb: 10
- name: mydjangodb
  type: gcp-types/sqladmin-v1beta4:databases
  properties:
    name: django-db-01
    instance: $(ref.mycloudsql.name)
    charset: utf8
- name: sqlroot
  type: gcp-types/sqladmin-v1beta4:users
  properties:
    name: root
    host: "%"
    instance: $(ref.mycloudsql.name)
    password: root
- name: sqluser
  type: gcp-types/sqladmin-v1beta4:users
  properties:
    name: user
    instance: $(ref.mycloudsql.name)
    password: user
Error:
PS C:\Users\user\Desktop\Python\GCP> gcloud --project=sound-catalyst-263911 deployment-manager deployments create dm-sql-test-11 --config conf.yaml
The fingerprint of the deployment is TZ_wYom9Q64Hno6X0bpv9g==
Waiting for create [operation-1589869946223-5a5fa71623bc9-1912fcb9-bc59aafc]...failed.
ERROR: (gcloud.deployment-manager.deployments.create) Error in Operation [operation-1589869946223-5a5fa71623bc9-1912fcb9-bc59aafc]: errors:
- code: RESOURCE_ERROR
location: /deployments/dm-sql-test-11/resources/sqluser
message: '{"ResourceType":"gcp-types/sqladmin-v1beta4:users","ResourceErrorCode":"400","ResourceErrorMessage":{"code":400,"message":"Precondition
check failed.","status":"FAILED_PRECONDITION","statusMessage":"Bad Request","requestPath":"https://www.googleapis.com/sql/v1beta4/projects/sound-catalyst-263911/instances/mycloudsql-01/users","httpMethod":"POST"}}'
- code: RESOURCE_ERROR
location: /deployments/dm-sql-test-11/resources/sqlroot
message: '{"ResourceType":"gcp-types/sqladmin-v1beta4:users","ResourceErrorCode":"400","ResourceErrorMessage":{"code":400,"message":"Precondition
check failed.","status":"FAILED_PRECONDITION","statusMessage":"Bad Request","requestPath":"https://www.googleapis.com/sql/v1beta4/projects/sound-catalyst-263911/instances/mycloudsql-01/users","httpMethod":"POST"}}'
Console view: (screenshot omitted)
It doesn't say what the failing precondition is. Or am I missing something?
It seems the installation of the database is not completed by the time Deployment Manager starts to create the users, despite the reference notation being used in the YAML code to take care of dependencies. That is why you receive the "FAILED_PRECONDITION" error.
As a workaround you can split the deployment into two parts:
1. Create the Cloud SQL instance and the database;
2. Create the users.
This does not look elegant, but it works.
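Another option that may be worth trying (a hedged sketch; Deployment Manager supports explicit ordering via metadata.dependsOn, and Cloud SQL generally allows only one pending operation per instance at a time) is to chain the user resources so they are created one after another, after the database:
- name: sqlroot
  type: gcp-types/sqladmin-v1beta4:users
  metadata:
    dependsOn:
    - mydjangodb
  properties:
    name: root
    host: "%"
    instance: $(ref.mycloudsql.name)
    password: root
- name: sqluser
  type: gcp-types/sqladmin-v1beta4:users
  metadata:
    dependsOn:
    - sqlroot
  properties:
    name: user
    instance: $(ref.mycloudsql.name)
    password: user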
Alternatively, you can consider using Terraform. Fortunately, the Cloud Shell instance comes with Terraform pre-installed. There is sample Terraform code for Cloud SQL out there, for example this one:
CloudSQL deployment with Terraform