Google compute firewalls disappears later - google-cloud-platform

I trying to create some firewall rules in google compute, everything goes well, but some time later, they are just disappears.
I tried to add rules on default network, and also custom created - in both cases result same.
Tried both: through web UI, and through gcloud tool

If you believe that someone or something is reverting your Firewall changes, you can take multiple approaches to verify that.
inspect Cloud Console Activity logs
same using CLI: gcloud beta logging read "resource.type=gce_firewall_rule"
check GCE Operations section in Cloud Console
check GCE API requests in Cloud Console Logging, using this advanced filter:
resource.type="gce_firewall_rule"
jsonPayload.event_subtype:"compute.firewalls"

Related

GCP | How I can see all the working virtual machines on a project?

I wrote some code to automate the training procedure on our company vm instances.
you probably know that sometimes GCP can't provide you at the current moment with a machine - 'out of resource' exception.
so , I'd like to monitor which of my machines successfully turned on and which not.
if there is some way to show it on Bigquery it will be great.
thanks .
Using the Cloud Monitoring (Stackdriver) functionality is good way for monitoring all you VMs.
Here is a detailed guide to implement Monitoring on a Compute Engine Instance.
Hope you find it useful.
You can use Google cloud's activity logs too:
Activity logging is enabled by default for all Compute Engine
projects.
You can see your project's activity logs through the Logs Viewer in
the Google Cloud Console:
In the Cloud Console, go to the Logging page. Go to the Logging page
When in the Logs Viewer, select and filter your resource type from the
first drop-down list. From the All logs drop-down list, select
compute.googleapis.com/activity_log to see Compute Engine activity
logs.
Here is the Official documentation.

How to connect GCP VM instance using android Cloud Console?

I've just created a VM instance at GCP with os-login feature enabled. I can connect via ssh using web browser console, SDK, but, I can't connect using the android app called Cloud Console.
Is there any extra step while configuring os-login to use the offical google android app?
Thanks.
I believe everything should work out of the box. There certainly isn't mention of any extra documentation on the Cloud Console Mobile App landing page.
I spun up a new instance (europe-west4-b, n1-standard-1, debian-9-stretch-v20190813) and set the metadata to enable-oslogin = TRUE and I managed to connect right away, albeit from an iOS device.
I would suggest trying different networks (WiFi, 3G/4G), double check any suspicious firewall tags on the instance, maybe even try from a different device. If the connection attempt did reach the server, chances are there's at least one log entry mentioning it in /var/log (/var/log/auth.log in Debian is a good place to start).

Kubernetes Engine unable to pull image from non-private / GCR repository

I was happily deploying to Kubernetes Engine for a while, but while working on an integrated cloud container builder pipeline, I started getting into trouble.
I don't know what changed. I can not deploy to kubernetes anymore, even in ways I did before without cloud builder.
The pods rollout process gives an error indicating that it is unable to pull from the registry. Which seems weird because the images exist (I can pull them using cli) and I granted all possibly related permissions to my user and the cloud builder service account.
I get the error ImagePullBackOff and see this in the pod events:
Failed to pull image
"gcr.io/my-project/backend:f4711979-eaab-4de1-afd8-d2e37eaeb988":
rpc error: code = Unknown desc = unauthorized: authentication required
What's going on? Who needs authorization, and for what?
In my case, my cluster didn't have the Storage read permission, which is necessary for GKE to pull an image from GCR.
My cluster didn't have proper permissions because I created the cluster through terraform and didn't include the node_config.oauth_scopes block. When creating a cluster through the console, the Storage read permission is added by default.
The credentials in my project somehow got messed up. I solved the problem by re-initializing a few APIs including Kubernetes Engine, Deployment Manager and Container Builder.
First time I tried this I didn't succeed, because to disable something you have to disable first all the APIs that depend on it. If you do this via the GCloud web UI then you'll likely see a list of services that are not all available for disabling in the UI.
I learned that using the gcloud CLI you can list all APIs of your project and disable everything properly.
Things worked after that.
The reason I knew things were messed up, is because I had a copy of the same things as a production environment, and there these problems did not exist. The development environment had a lot of iterations and messing around with credentials, so somewhere things got corrupted.
These are some examples of useful commands:
gcloud projects get-iam-policy $PROJECT_ID
gcloud services disable container.googleapis.com --verbosity=debug
gcloud services enable container.googleapis.com
More info here, including how to restore service account credentials.

How to integrate on premise logs with GCP stackdriver

I am evaluating stackdriver from GCP for logging across multiple micro services.
Some of these services are deployed on premise and some of them are on AWS/GCP.
Our services are either .NET or nodejs based apps and we are invested in winston for nodejs and nlog in .net.
I was looking # integrating our on-premise nodejs application with stackdriver logging. Looking # https://cloud.google.com/logging/docs/setup/nodejs the documentation it seems that there we need to install the agent for any machine other than the google compute instances. Is this correct?
if we need to install the agent then is there any way where I can test the logging during my development? The development environment is either a windows 10/mac.
There's a new option for ingesting logs (and metrics) with Stackdriver as most of the non-google environment agents look like they are being deprecated. https://cloud.google.com/stackdriver/docs/deprecations/third-party-apps
A Google post on logging on-prem resources with stackdriver and Blue Medora
https://cloud.google.com/solutions/logging-on-premises-resources-with-stackdriver-and-blue-medora
for logs you still need to install an agent on each box to collect the logs, it's a BindPlane agent not a Google agent.
For node.js, you can use the #google-cloud/logging-winston and #google-cloud/logging-bunyan modules from anywhere (on-prem, AWS, GCP, etc.). You will need to provide projectId and auth credentials manually if not running on GCP. Instructions on how to set these up is available in the linked pages.
When running on GCP we figure out the exact environment (App Engine, Compute Engine, etc.) automatically and the logs should up under those resources in the Logging UI. If you are going to use the modules from your development machines, we will report the logs against the 'global' resource by default. You can customize this by passing a specific resource descriptor yourself.
Let us know if you run into any trouble.
I tried setting this up on my local k8s cluster. By following this: https://kubernetes.io/docs/tasks/debug-application-cluster/logging-stackdriver/
But i couldnt get it to work, the fluentd-gcp-v2.0-qhqzt keeps crashing.
Also, the page mentions that there are multiple issues with stackdriver logging if you DONT use it on google GKE. See the screenshot.
I think google is trying to lock you in into GKE.

Exit code non-zero and unable to see output logs

How do I view stdout/stderr output logs for cloud ML? I've tried using gcloud beta logging read and also gcloud beta ml jobs stream-logs and nothing... all I see are the INFO level logs generated by the system i.e. "Tearing down TensorFlow".
Also in the case where I have an error that shows the docker container exited with non zero code. It links me to a GUI page that shows the same stuff as gcloud beta ml jobs stream-logs. Nothing that shows me the actual output from the console of what my job produced...
Help please??
It may be the case that the Cloud ML service account does not have permissions to write to your project's StackDriver Logs, or the Logging API is not enabled on your project.
First check whether the Stackdriver Logging API is enabled for the project by going to the API manager: https://console.cloud.google.com/apis/api/logging.googleapis.com/overview?project=[YOUR-PROJECT-ID]
Then Cloud ML service account should be automatically added as an Editor to the project, and therefore allows it to write to the project logs, but if you have changed your project permissions it may have lost it. If so, check that you've manually given the Cloud ML service account LogWriter permissions.
If you are unsure of the service account used by Cloud ML, this page has instructions on how to find it: https://cloud.google.com/ml/docs/how-tos/using-external-buckets