"Error: unknown shorthand flag: 'n' in -nstances" when trying to connect Google Cloud Proxy to Postgresql (Django) - django

I'm following a Google tutorial to set up Django on Cloud Run with PostgreSQL connected via the Cloud SQL Proxy. However, I keep hitting an error on this command in Google Cloud Shell.
Cloud Shell input:
xyz@cloudshell:~ (project-xyz)$ ./cloud-sql-proxy -instances="amz-reporting-files-21:us-west1-c:api-20230212"=tcp:5432
returns:
Error: unknown shorthand flag: 'n' in -nstances=amz-reporting-files-21:us-west1-c:Iamz-ads-api-20230212=tcp:5432
Usage:
cloud-sql-proxy INSTANCE_CONNECTION_NAME... [flags]
Flags:
-a, --address string () Address to bind Cloud SQL instance listeners. (default "127.0.0.1")
--admin-port string Port for localhost-only admin server (default "9091")
-i, --auto-iam-authn () Enables Automatic IAM Authentication for all instances
-c, --credentials-file string Use service account key file as a source of IAM credentials.
--debug Enable the admin server on localhost
--disable-metrics Disable Cloud Monitoring integration (used with --telemetry-project)
--disable-traces Disable Cloud Trace integration (used with --telemetry-project)
--fuse string Mount a directory at the path using FUSE to access Cloud SQL instances.
--fuse-tmp-dir string Temp dir for Unix sockets created with FUSE (default "/tmp/csql-tmp")
-g, --gcloud-auth Use gcloud's user credentials as a source of IAM credentials.
--health-check Enables health check endpoints /startup, /liveness, and /readiness on localhost.
-h, --help Display help information for cloud-sql-proxy
--http-address string Address for Prometheus and health check server (default "localhost")
--http-port string Port for Prometheus and health check server (default "9090")
--impersonate-service-account string Comma separated list of service accounts to impersonate. Last value
is the target account.
-j, --json-credentials string Use service account key JSON as a source of IAM credentials.
--max-connections uint Limit the number of connections. Default is no limit.
--max-sigterm-delay duration Maximum number of seconds to wait for connections to close after receiving a TERM signal.
-p, --port int () Initial port for listeners. Subsequent listeners increment from this value.
--private-ip () Connect to the private ip address for all instances
--prometheus Enable Prometheus HTTP endpoint /metrics on localhost
--prometheus-namespace string Use the provided Prometheus namespace for metrics
--quiet Log error messages only
--quota-project string Specifies the project to use for Cloud SQL Admin API quota tracking.
The IAM principal must have the "serviceusage.services.use" permission
for the given project. See https://cloud.google.com/service-usage/docs/overview and
https://cloud.google.com/storage/docs/requester-pays
--sqladmin-api-endpoint string API endpoint for all Cloud SQL Admin API requests. (default: https://sqladmin.googleapis.com)
-l, --structured-logs Enable structured logging with LogEntry format
--telemetry-prefix string Prefix for Cloud Monitoring metrics.
--telemetry-project string Enable Cloud Monitoring and Cloud Trace with the provided project ID.
--telemetry-sample-rate int Set the Cloud Trace sample rate. A smaller number means more traces. (default 10000)
-t, --token string Use bearer token as a source of IAM credentials.
-u, --unix-socket string (*) Enables Unix sockets for all listeners with the provided directory.
--user-agent string Space separated list of additional user agents, e.g. cloud-sql-proxy-operator/0.0.1
-v, --version Print the cloud-sql-proxy version
While my input is "-instances" the error message returns "-nstances" as if it's either truncating somehow, or as if it's matching my input to the "-i" flag inadvertently.
I've tried shortening my project name to avoid truncating, and tried inputting the command inside a yaml file instead of running it in google cloud shell.

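It looks like -instances is not a valid flag for this version of the Cloud SQL Proxy tool, hence the error. With a single dash, the argument is read as a run of one-letter flags: -i (--auto-iam-authn) is accepted, and the next letter, n, matches nothing, which is why the message quotes -nstances.
Remove that flag and pass the instance connection name as a positional argument; something like the following should work.
./cloud-sql-proxy amz-reporting-files-21:us-west1-c:api-20230212 -p 5432
Please refer to the supported flags here.
This uses the latest cloud-sql-proxy version, 2.0.0.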
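The way a single-dash argument gets split into one-letter flags can be modelled in a few lines. This is a simplified Python sketch, not the proxy's actual flag-parsing code; the function name `parse_shorthands` and the set of boolean shorthands (read off the usage text in the question) are assumptions made for illustration.

```python
# Simplified model of POSIX-style shorthand parsing (NOT the proxy's real parser).
# Assumption: only boolean one-letter flags are modelled; the set below is read
# off the usage text in the question (-i, -g, -h, -l, -v take no value).
BOOL_SHORTHANDS = {"i", "g", "h", "l", "v"}


def parse_shorthands(arg: str) -> str:
    """Describe how a single-dash argument is consumed letter by letter."""
    rest = arg[1:]  # drop the leading "-"
    while rest:
        letter = rest[0]
        if letter not in BOOL_SHORTHANDS:
            # The message quotes whatever is left, which is why the leading
            # "i" has already disappeared from "-instances".
            return f"unknown shorthand flag: '{letter}' in -{rest}"
        rest = rest[1:]  # letter accepted, move on to the next one
    return "ok"


message = parse_shorthands("-instances=x")
print(message)
```

Under this model the leading `i` is consumed as `-i`, and parsing stops at `n`, which matches the shape of the error in the question.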

Related

Chaincode (invoke) is not able to endorse on remote cluster with all three orgs, org1 succeeds but org2 and org3 don't. What could be wrong?

I have a Kubernetes cluster configured which builds perfectly when running via Docker Desktop, including invoking with successful endorsement via all three Chaincode containers in the network.
On the remote side, I'm using AWS EKS to deploy my nodes and I have more recently followed this guide on deploying a production ready peer. I already had EFS set up and in use as a k8s Persistent Volume, and this is populated each time I spool up a network with all the config. This means all the crypto materials, connection profiles, etc are mounted to the relevant containers and as per best practice the reference to these TLS certs is in this directory.
This all works as expected... my admin pods can communicate with my peers, the orderers connect, etcetera. I'm able to fully install chaincode, approve it and commit it to all three of my peers successfully.
When it comes to invoking the chaincode, my org1 container always succeeds, and successfully communicates with the peer in its organization.
I'm aware of the core.yaml setting localMspId and this is being overridden by the environment variable CORE_PEER_LOCALMSPID for each set of peers, such that in my org1 peer the value is Org1MSP, in org2 it's Org2MSP, etc.
When running peer chaincode invoke, the first container (org1) succeeds very quickly, the other two try to contact their peers and hang for the timeout period set in the default gRPC settings (110000ms wait). I also have set the env var of CORE_PEER_ADDRESS_AUTODETECT: "true" on my peer in order to ensure it doesn't try to resolve using the hostnames like peer0.org1 (this clearly works for org1 but not the other two).
The environment variables set for TLS in each of the containers correspond to the contents of the ones I am passing (in correct order) with my invoke command:
peer chaincode invoke --ctor '${CC_INIT_ARGS}' --channelID ${CHANNEL_ID} --name ${CC_NAME} --cafile \$ORDERER_TLS_ROOTCERT_FILE \
--tls true -o orderer.${ORG}:7050 \
--peerAddresses peer0.org1:7051 \
--peerAddresses peer0.org2:7051 \
--peerAddresses peer0.org3:7051 \
--tlsRootCertFiles /etc/hyperledger/fabric-peer/client-root-tlscas/tlsca.org1-cert.pem \
--tlsRootCertFiles /etc/hyperledger/fabric-peer/client-root-tlscas/tlsca.org2-cert.pem \
--tlsRootCertFiles /etc/hyperledger/fabric-peer/client-root-tlscas/tlsca.org3-cert.pem >&invoke-log.txt
cat invoke-log.txt
That command is executed inside my container, and as mentioned, I have manually confirmed this by inspecting all three containers, running cat on the files there, doing the same with the above paths, and checking that they match exactly. That is to say, the contents of /etc/hyperledger/fabric-peer/client-root-tlscas/tlsca.org1-cert.pem are equivalent to the file referenced by the CORE_PEER_TLS_ROOTCERT_FILE setting in org1, and so on per organization.
Example org1 chaincode container logs:
2022-02-23T13:47:07.255Z debug [c-api:lib/handler.js] [allorgs-5e707801] Calling chaincode Invoke(), response status: 200
2022-02-23T13:47:07.256Z info [c-api:lib/handler.js] [allorgs-5e707801] Calling chaincode Invoke() succeeded. Sending COMPLETED message back to peer
For org2 and org3 containers, once it finally finishes the timeout, it outputs:
2022-02-23T12:24:05.045Z error [c-api:lib/handler.js] Chat stream with peer - on error: %j "Error: 14 UNAVAILABLE: No connection established\n at Object.callErrorFromStatus (/usr/local/src/node_modules/@grpc/grpc-js/build/src/call.js:31:26)\n at Object.onReceiveStatus (/usr/local/src/node_modules/@grpc/grpc-js/build/src/client.js:391:49)\n at Object.onReceiveStatus (/usr/local/src/node_modules/@grpc/grpc-js/build/src/client-interceptors.js:328:181)\n at /usr/local/src/node_modules/@grpc/grpc-js/build/src/call-stream.js:182:78\n at processTicksAndRejections (internal/process/task_queues.js:79:11)"
2022-02-23T12:24:05.045Z debug [c-api:lib/handler.js] Chat stream ending
I have also enabled DEBUG logs on everything and I'm gleaning nothing useful from it. Any help or suggestions would be greatly appreciated!
The three peers share the same port. Is that even possible?
Also, when running invoke from the command line, I would normally use the following pattern, repeated for each peer.
--peerAddresses localhost:6051 --tlsRootCertFiles <path to peer on port 6051>
--peerAddresses localhost:6052 --tlsRootCertFiles <path to peer on port 6052>
not the three peers followed by the three TLS cert file paths.
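The pairing pattern described above can be generated rather than typed by hand. A minimal Python sketch, assuming a hypothetical helper named `build_peer_args`; the addresses and certificate paths are the ones from the question. It only builds the argument list and does not run `peer`.

```python
from typing import List, Tuple


def build_peer_args(peers: List[Tuple[str, str]]) -> List[str]:
    """Emit each --peerAddresses immediately followed by its --tlsRootCertFiles."""
    args: List[str] = []
    for address, cert_path in peers:
        args += ["--peerAddresses", address, "--tlsRootCertFiles", cert_path]
    return args


# Paths and host names taken from the invoke command in the question.
CERT_DIR = "/etc/hyperledger/fabric-peer/client-root-tlscas"
PEERS = [
    (f"peer0.org{n}:7051", f"{CERT_DIR}/tlsca.org{n}-cert.pem") for n in (1, 2, 3)
]

peer_args = build_peer_args(PEERS)
print(" \\\n".join(" ".join(peer_args[i:i + 4]) for i in range(0, len(peer_args), 4)))
```

Keeping each address next to its certificate makes a mismatch between the two lists easier to spot when reading the command.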

"Kafka Timed out waiting for a node assignment." on MSK

Specs:
The serverless Amazon MSK that's in preview.
t2.xlarge EC2 instance with Amazon Linux 2
Installed Kafka from https://dlcdn.apache.org/kafka/3.0.0/kafka_2.13-3.0.0.tgz
openjdk version "11.0.13" 2021-10-19 LTS
OpenJDK Runtime Environment 18.9 (build 11.0.13+8-LTS)
OpenJDK 64-Bit Server VM 18.9 (build 11.0.13+8-LTS, mixed mode, sharing)
Gradle 7.3.3
https://github.com/aws/aws-msk-iam-auth, successfully built.
I also tried adding IAM authentication information, as recommended by the Amazon MSK Library for AWS Identity and Access Management. It says to add the following in config/client.properties:
# Sets up TLS for encryption and SASL for authN.
security.protocol = SASL_SSL
# Identifies the SASL mechanism to use.
sasl.mechanism = AWS_MSK_IAM
# Binds SASL client implementation.
# sasl.jaas.config = software.amazon.msk.auth.iam.IAMLoginModule required;
# Encapsulates constructing a SigV4 signature based on extracted credentials.
# The SASL client bound by "sasl.jaas.config" invokes this class.
sasl.client.callback.handler.class = software.amazon.msk.auth.iam.IAMClientCallbackHandler
# Binds SASL client implementation. Uses the specified profile name to look for credentials.
sasl.jaas.config = software.amazon.msk.auth.iam.IAMLoginModule required awsProfileName="kafka-client";
And kafka-client is the IAM role attached to the EC2 instance as an instance profile.
Networking: I used VPC Reachability Analyzer to confirm that the security groups are configured correctly and the EC2 instance I'm using as a Producer can reach the serverless MSK cluster.
What I'm trying to do: create a topic.
How I'm trying: bin/kafka-topics.sh --create --partitions 1 --replication-factor 1 --topic quickstart-events --bootstrap-server boot-zclcyva3.c2.kafka-serverless.us-east-2.amazonaws.com:9098
Result:
Error while executing topic command : Timed out waiting for a node assignment. Call: createTopics
[2022-01-17 01:46:59,753] ERROR org.apache.kafka.common.errors.TimeoutException: Timed out waiting for a node assignment. Call: createTopics
(kafka.admin.TopicCommand$)
I'm also trying: with the plaintext port of 9092. (9098 is the IAM-authentication port in MSK, and serverless MSK uses IAM authentication by default.)
All the other posts I found on SO about this node assignment error didn't include MSK. I tried suggestions like uncommenting the listener setting in server.properties, but that didn't change anything.
Installing kcat for troubleshooting didn't work for me, since there's no out-of-the box installation for the yum package manager, which Amazon Linux 2 uses, and since these instructions failed for me at checking for libcurl (by compile)... failed (fail).
The Question: Any other tips on solving this "node assignment" error?
The documentation has been updated recently; I was able to follow it end to end without any issue (the IAM policy is now correct):
https://docs.aws.amazon.com/msk/latest/developerguide/serverless-getting-started.html
The properties file you created is not used automatically; your command needs to include --command-config client.properties. The contents of this file are documented in the MSK docs on the linked IAM page.
Extract...
ssl.truststore.location=<PATH_TO_TRUST_STORE_FILE>
security.protocol=SASL_SSL
sasl.mechanism=AWS_MSK_IAM
sasl.jaas.config=software.amazon.msk.auth.iam.IAMLoginModule required;
sasl.client.callback.handler.class=software.amazon.msk.auth.iam.IAMClientCallbackHandler
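As a rough illustration of the point about `--command-config`, the following Python sketch renders the IAM settings as properties text and assembles the topic-creation command with the config file attached. The helper names are hypothetical, the bootstrap server and topic are the ones from the question, and the truststore line is left out because its path is a placeholder in the extract. Nothing is executed; the sketch only prints what would be run.

```python
from typing import Dict, List


def render_properties(settings: Dict[str, str]) -> str:
    """Render key=value lines in the format kafka client tools read."""
    return "".join(f"{key}={value}\n" for key, value in settings.items())


def topic_create_command(bootstrap_server: str, topic: str, config_path: str) -> List[str]:
    """Build the kafka-topics.sh call, including the client config file."""
    return [
        "bin/kafka-topics.sh", "--create",
        "--partitions", "1",
        "--replication-factor", "1",
        "--topic", topic,
        "--bootstrap-server", bootstrap_server,
        "--command-config", config_path,  # without this the properties are ignored
    ]


# Settings copied from the extract above (truststore placeholder omitted).
IAM_SETTINGS = {
    "security.protocol": "SASL_SSL",
    "sasl.mechanism": "AWS_MSK_IAM",
    "sasl.jaas.config": "software.amazon.msk.auth.iam.IAMLoginModule required;",
    "sasl.client.callback.handler.class": "software.amazon.msk.auth.iam.IAMClientCallbackHandler",
}

command = topic_create_command(
    "boot-zclcyva3.c2.kafka-serverless.us-east-2.amazonaws.com:9098",
    "quickstart-events",
    "config/client.properties",
)
print(render_properties(IAM_SETTINGS))
print(" ".join(command))
```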
Alternatively, if the plaintext port didn't work either, then you have other networking issues.
Beyond these steps, I suggest reaching out to MSK support and telling them to update the "Create a Topic" page to no longer use Zookeeper, keeping in mind that Kafka 3.0 is not (yet) supported.

AWS Airflow v2.0.2 doesn't show Google Cloud connection type

I want to load data from Google Storage to S3
To do this I want to use GoogleCloudStorageToS3Operator, which requires gcp_conn_id
So, I need to set up Google Cloud connection type
To do this, I added
apache-airflow[google]==2.0.2
to requirements.txt
but Google Cloud connection type is still not in Dropdown list of connections in MWAA
Same approach works well with mwaa local runner
https://github.com/aws/aws-mwaa-local-runner
I guess it does not work in MWAA because of security reasons discussed here
https://lists.apache.org/thread.html/r67dca5845c48cec4c0b3c34c3584f7c759a0b010172b94d75b3188a3%40%3Cdev.airflow.apache.org%3E
But still, is there any workaround to add Google Cloud connection type in MWAA?
Connections can be created and managed using either the UI or environment variables.
To my understanding, the limitation MWAA places on installing some provider packages applies only to the web server machine, which is why the connection types are not listed in the UI. This doesn't mean you can't create the connection at all; it just means you can't do it from the UI.
You can define it from CLI:
airflow connections add [-h] [--conn-description CONN_DESCRIPTION]
[--conn-extra CONN_EXTRA] [--conn-host CONN_HOST]
[--conn-login CONN_LOGIN]
[--conn-password CONN_PASSWORD]
[--conn-port CONN_PORT] [--conn-schema CONN_SCHEMA]
[--conn-type CONN_TYPE] [--conn-uri CONN_URI]
conn_id
You can also generate a connection URI to make it easier to set.
Connections can also be set as environment variables. Example:
export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='google-cloud-platform://?extra__google_cloud_platform__key_path=%2Fkeys%2Fkey.json&extra__google_cloud_platform__scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform&extra__google_cloud_platform__project=airflow&extra__google_cloud_platform__num_retries=5'
If needed you can check the google provider package docs to review the configuration options of the connection.
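The connection URI in the example above can be produced with the standard library instead of being percent-encoded by hand. This is a small sketch; the helper name `google_cloud_conn_uri` is hypothetical, and the extras are the ones shown in the example.

```python
from urllib.parse import urlencode


def google_cloud_conn_uri(extras: dict) -> str:
    """Percent-encode the extras into a google-cloud-platform connection URI."""
    return "google-cloud-platform://?" + urlencode(extras)


# Values copied from the export example above.
EXTRAS = {
    "extra__google_cloud_platform__key_path": "/keys/key.json",
    "extra__google_cloud_platform__scope": "https://www.googleapis.com/auth/cloud-platform",
    "extra__google_cloud_platform__project": "airflow",
    "extra__google_cloud_platform__num_retries": 5,
}

uri = google_cloud_conn_uri(EXTRAS)
print(f"export AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='{uri}'")
```

`urlencode` escapes `/` and `:` in the values, so the key path and scope come out in the same encoded form as in the example.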
For MWAA there are two options for setting the connection:
1. Set an environment variable using the pattern AIRFLOW_CONN_YOUR_CONNECTION_NAME, where e.g. YOUR_CONNECTION_NAME = GOOGLE_CLOUD_DEFAULT. That can be done using a custom plugin:
https://docs.aws.amazon.com/mwaa/latest/userguide/samples-env-variables.html
2. Use Secrets Manager:
https://docs.aws.amazon.com/mwaa/latest/userguide/connections-secrets-manager.html
I tested both for a Google Cloud connection, and both work.
I asked AWS support about this issue. It looks like they are working on it.
They told me a way to configure the Google Cloud Platform connection by passing a JSON object in the extras with Conn Type set to HTTP, and it works.
I validated this by editing google_cloud_default (Airflow > Admin > Connections):
Conn Type: HTTP
Extra:
{
  "extra__google_cloud_platform__project":"<YOUR_VALUE>",
  "extra__google_cloud_platform__key_path":"",
  "extra__google_cloud_platform__keyfile_dict":"{\"type\": \"service_account\",\"project_id\": \"<YOUR_VALUE>\",\"private_key_id\": \"<YOUR_VALUE>\", \"private_key\": \"-----BEGIN PRIVATE KEY-----\\n<YOUR_VALUE>\\n-----END PRIVATE KEY-----\\n\", \"client_email\": \"<YOUR_VALUE>\", \"client_id\": \"<YOUR_VALUE>\", \"auth_uri\": \"https://<YOUR_VALUE>\", \"token_uri\": \"https://<YOUR_VALUE>\", \"auth_provider_x509_cert_url\": \"https://<YOUR_VALUE>\", \"client_x509_cert_url\": \"https://<YOUR_VALUE>\"}",
  "extra__google_cloud_platform__scope":"",
  "extra__google_cloud_platform__num_retries":"5"
}
!! You must escape the " and \n in extra__google_cloud_platform__keyfile_dict !!
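One way to avoid escaping by hand is to let a JSON encoder do it. The sketch below assumes a hypothetical helper `build_extra` and a shortened key file containing only placeholder values. It serialises the key file to a string first, then embeds that string in the outer Extra object, so the inner quotes and newlines are escaped automatically.

```python
import json

# Shortened service account key with placeholder values (not real credentials).
KEYFILE = {
    "type": "service_account",
    "project_id": "<YOUR_VALUE>",
    "private_key": "-----BEGIN PRIVATE KEY-----\n<YOUR_VALUE>\n-----END PRIVATE KEY-----\n",
    "client_email": "<YOUR_VALUE>",
}


def build_extra(project: str, keyfile: dict) -> str:
    """Return the Extra JSON with the key file embedded as an escaped string."""
    return json.dumps(
        {
            "extra__google_cloud_platform__project": project,
            "extra__google_cloud_platform__key_path": "",
            # Serialising the inner dict first makes it a string value, so the
            # outer dumps() escapes its quotes and newlines.
            "extra__google_cloud_platform__keyfile_dict": json.dumps(keyfile),
            "extra__google_cloud_platform__scope": "",
            "extra__google_cloud_platform__num_retries": "5",
        },
        indent=2,
    )


extra = build_extra("<YOUR_VALUE>", KEYFILE)
print(extra)
```

The printed text can be pasted into the Extra field; decoding it twice gives back the original key file.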
In requirements.txt I used:
apache-airflow[gcp]==2.0.2
(I believe apache-airflow[google]==2.0.2 should work as well)

How to schedule task to call gRPC method?

I have a .NET server running in Google Kubernetes Engine. It is configured to use gRPC through Google Cloud Endpoints. Now I need to schedule a task to call my gRPC method once per day.
The first thing I tried was to use Google Cloud Scheduler to call HTTP methods directly. For that I have:
Set up HTTP to gRPC transcoding on my server to call my gRPC method through HTTP.
Created and enabled an SSL certificate as described here.
Created a service account in the IAM & admin console with the Service Account Token Creator and Service Account User permissions.
Created a Cloud Scheduler job with my URL, the Auth header set to OIDC token, and the service account created above.
Deployed a Google Cloud Endpoints configuration with the following parameters (among others):
authentication:
  providers:
  - id: google_service_account
    issuer: MY_SERVICE_ACCOUNT_EMAIL
    jwks_uri: https://www.googleapis.com/robot/v1/metadata/x509/MY_SERVICE_ACCOUNT_EMAIL
  rules:
  - selector: "*"
    requirements:
    - provider_id: google_service_account
After that, when I run the scheduler job, it returns the result "Failed". The logs show ERROR with status UNKNOWN.
The second thing I tried was to use Google Cloud Scheduler to publish a message to a Pub/Sub topic with my server as the subscriber.
That was unsuccessful too, because I can't verify ownership of the Google Cloud Endpoints domain. I asked a question about that here: How to verify ownership of Google Cloud Endpoints service URL?
Now the question: what is the best way to schedule a task that calls a gRPC method, assuming the following environment:
.NET server running on GKE
gRPC
Automated periodic call of that task (I can call it manually, but that's meaningless)
So you were able to make an HTTP call manually, but not automatically from Google Cloud Scheduler, is that correct?
If so, check whether the request reaches the Cloud Endpoints proxy by looking at the Endpoints logging in the Cloud console; it may give you some hints.
Distributed scheduler
For more details, refer to the source code: Distributed scheduler
This application can be run on different hosts and offers functionality to schedule execution of an arbitrary command at a particular time or periodically.
There are two ways to communicate with the application: gRPC and REST. Remote interfaces are specified in the dsched.proto file. The corresponding REST API can also be found there in the form of API annotations. We also provide generated Swagger files.
To specify task execution timing, we use the notation adopted by cron.
Scheduled tasks are stored in a file and loaded automatically during startup.
Building
Install gRPC
Install gRPC gateway
To parse crontab statements and schedule task execution, we use the gopkg.in/robfig/cron.v2 library, so it should also be installed: go get -u gopkg.in/robfig/cron.v2. Documentation can be found here.
Get the dsched package: go get -u gitlab.com/andreynech/dsched
Now it is possible to run the standard go build command in the dscheduler and gateway directories to generate binaries for the scheduler and the REST/JSON API gateway. It might also be helpful to examine our CI configuration file to see how we set up the build environment.
Running
All the scheduling functionality is implemented by dscheduler executable. So
it could be run on system startup or on demand. As described by dscheduler --help,
there are two command line parameters:
-i string - File name to store task list (default "/var/run/dscheduler.db")
-p string - Endpoint to listen (default ":50051")
If there is a need to offer the REST/JSON API, the gateway application located in the gateway directory should be run. It could reside on the same host as dscheduler, but typically it would be another host which is accessible over HTTP from outside and at the same time can talk to dscheduler running in the internal network. This setup was also the reason to split the scheduler and the gateway into two executables. gateway is a mostly generated application and supports several command-line parameters described by running gateway --help. The important parameter is -sched_endpoint string, which is the endpoint of the Scheduler service (default "localhost:50051"). It specifies the host name and port where dscheduler is listening for requests.
Scheduling tasks (testing)
There are three ways to control scheduler server:
Using Go client implemented in cli/ directory
Using Python client implemented in py_cli directory
Using REST/JSON API gateway and curl
Go and Python clients have similar set of command line parameters.
$ ./cli --help
Usage of cli:
-a string
The command to execute at time specified by -c parameter
-c string
Statement in crontab format describes when to execute the command
-e string
Host:port to connect (default "localhost:50051")
-l List scheduled tasks
-p Purge all scheduled tasks
-r int
Remove the task with specified id from schedule
-s Schedule task. -c and -a arguments are required in this case
They are using gRPC protocol to talk to scheduler server. Here are several
example invocations:
$ ./cli -l                                  # list currently scheduled tasks
$ ./cli -s -c "@every 0h00m10s" -a "df"     # schedule the df command for execution every 10 seconds
$ ./cli -s -c "0 30 * * * *" -a "ls -l"     # schedule ls -l to run at 30 minutes past every hour
$ ./cli -r 3                                # remove the task with ID 3
$ ./cli -p                                  # remove all scheduled tasks
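The six-field statements used in these invocations can be easier to read with the fields labelled. The sketch below assumes that the statement is handed unchanged to the robfig cron library named in the Building section, which reads six fields with seconds first; the helper name `label_fields` is hypothetical.

```python
# Field order assumed from the robfig cron v2 format: seconds come first.
FIELD_NAMES = ("second", "minute", "hour", "day of month", "month", "day of week")


def label_fields(expression: str) -> dict:
    """Pair each field of a six-field cron statement with its meaning."""
    fields = expression.split()
    if len(fields) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(fields)}")
    return dict(zip(FIELD_NAMES, fields))


labelled = label_fields("0 30 * * * *")
for name, value in labelled.items():
    print(f"{name:>12}: {value}")
```

Read this way, `0 30 * * * *` fires at second 0 of minute 30 in every hour, that is, once an hour on the half hour. For a fixed interval, the `@every <duration>` form is the library's dedicated syntax.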
It is also possible to use curl to invoke dscheduler functionality over
REST/JSON API gateway. Assuming that dscheduler and gateway applications
are running, here are some invocations to list, add and remove scheduling
entries from the same host (localhost):
curl 'http://localhost:8080/v1/scheduler/list'    # list currently scheduled tasks
curl -d '{"id":0, "cron":"@every 0h00m10s", "action":"ls"}' -X POST 'http://localhost:8080/v1/scheduler/add'    # schedule the ls command for execution every 10 seconds
curl -d '{"id":0, "cron":"0 30 * * * *", "action":"ls -l"}' -X POST 'http://localhost:8080/v1/scheduler/add'    # schedule ls -l to run at 30 minutes past every hour
curl -d '{"id":2}' -X POST 'http://localhost:8080/v1/scheduler/remove'    # remove the task with ID 2
curl -X POST 'http://localhost:8080/v1/scheduler/removeall'    # remove all scheduled tasks
All changes are automatically saved in file.
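The same gateway call can be prepared from a script. This is a minimal sketch using only the Python standard library; the helper name `build_add_request` and the explicit Content-Type header are assumptions, while the path and payload fields are the ones from the curl examples (with `@every`, the interval syntax of the cron library the project uses). The request is only constructed here; sending it requires dscheduler and the gateway to be running.

```python
import json
import urllib.request


def build_add_request(base_url: str, cron: str, action: str) -> urllib.request.Request:
    """Prepare a POST to the gateway's add endpoint without sending it."""
    payload = json.dumps({"id": 0, "cron": cron, "action": action}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/scheduler/add",
        data=payload,
        headers={"Content-Type": "application/json"},  # assumption, not from the README
        method="POST",
    )


request = build_add_request("http://localhost:8080", "@every 0h00m10s", "ls")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))

# To actually send it (needs dscheduler and gateway running):
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```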
Thoughts on scheduler service discovery
In large deployment scenarios (like hundreds of hosts) it might be a challenging problem to find out all the IP addresses and ports where the scheduler service is started. It would be pretty easy to add support for Zeroconf (Bonjour/Avahi) technology to simplify service discovery. As an alternative, it might be possible to implement something similar to the CORBA Naming Service, where running services register themselves and the location of the naming service is well known. We decided to collect feedback before settling on a particular service discovery implementation, so your input is very welcome!

Enabling HA namenodes on a secure cluster in Cloudera Manager fails

I am running a CDH4.1.2 secure cluster and it works fine with the single namenode+secondarynamenode configuration, but when I try to enable High Availability (quorum based) from the Cloudera Manager interface it dies at step 10 of 16, "Starting the NameNode that will be transitioned to active mode namenode ([my namenode's hostname])".
Digging into the role log file gives the following fatal error:
Exception in namenode join
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: [my namenode's fqhn]:[my namenode's fqhn]:0
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:206)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:158)
    at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:147)
    at org.apache.hadoop.hdfs.server.namenode.NameNodeHttpServer.start(NameNodeHttpServer.java:143)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.startHttpServer(NameNode.java:547)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.startCommonServices(NameNode.java:480)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:443)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:608)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:589)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1140)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1204)
How can I resolve this?
It looks like you have two problems:
The NameNode's IP address is resolving to "my namenode's fqhn" instead of a regular hostname. Check your /etc/hosts file to fix this.
You need to configure dfs.https.port. With Cloudera Manager free edition, you must have had to add the appropriate configs to the safety valves to enable security. As part of that, you need to configure the dfs.https.port.
Given that this code path is traversed even in the non-HA mode, I'm surprised that you were able to get your secure NameNode to start up correctly before enabling HA. In case you haven't already, I recommend that you first enable security, test that all HDFS roles start up correctly and then enable HA.
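To see why the value in the exception is rejected, it can help to check it against a plain host:port rule. The following is a simplified Python model, not Hadoop's NetUtils logic; the function name and the sample host name and port are hypothetical.

```python
def is_valid_authority(value: str) -> bool:
    """Accept exactly one host followed by one numeric port (simplified rule)."""
    host, separator, port = value.rpartition(":")
    return bool(separator) and bool(host) and ":" not in host and port.isdigit()


# Hypothetical values for illustration only.
doubled = "nn1.example.com:nn1.example.com:0"  # shape of the value in the error
expected = "nn1.example.com:50470"             # one host, one port

print(doubled, "->", is_valid_authority(doubled))
print(expected, "->", is_valid_authority(expected))
```

The doubled form fails because the host part itself contains a colon, which is consistent with the address being assembled from a host name plus a setting that already held a host (or held no usable port).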