Issue while creating an on-prem Anthos admin cluster - google-cloud-platform

I am trying to create an on-prem Anthos admin cluster but am seeing this issue.
`0213 20:12:24.462950 124318 timeouts.go:43] OnPremAdminCluster "gke-admin-jcb" is not ready: ready condition does not exist
I0213 20:12:34.455111 124318 clusterdeployer.go:2284] failed to get kubeconfig for admin cluster "gke-admin-jcb: failed to get kubeconfig from kind cluster: secrets "admin" not found
`
`$gcloud config list
[core]
disable_usage_reporting = True
project = jcb-admin-project
Your active configuration is: [default]
`
Thanks in advance if someone can provide any guidance.

Related

How to configure Packer to SSH into a GCP VM for building an image?

I am building a GCP image with Packer. I created a service account with the "Compute Instance Admin v1" and "Service Account User" roles. Packer can successfully create the VM but cannot SSH into the instance to proceed further with the custom image.
Error message
Build 'googlecompute.custom-image' errored after 2 minutes 20 seconds: Packer experienced an authentication error when trying to connect via SSH. This can happen if your username/password are wrong. You may want to double-check your credentials as part of your debugging process. original error: ssh: handshake failed: ssh: unable to authenticate, attempted methods [none publickey], no supported methods remain
build file source code (packer.pkr.hcl)
locals {
  project_id              = "project-id"
  source_image_family     = "rocky-linux-8"
  source_image_project_id = ["rocky-linux-cloud"]
  ssh_username            = "packer"
  machine_type            = "e2-medium"
  zone                    = "us-central1-a"
}

source "googlecompute" "custom-image" {
  image_name              = "custom-image"   # Name of image to be created
  image_description       = "Custom Image 1" # Description for image to be created
  project_id              = "${local.project_id}"
  source_image_family     = "${local.source_image_family}"
  source_image_project_id = "${local.source_image_project_id}"
  ssh_username            = "${local.ssh_username}"
  machine_type            = "${local.machine_type}"
  zone                    = "${local.zone}"
}

build {
  sources = ["source.googlecompute.custom-image"]

  #
  # Run arbitrary shell script file
  #
  provisioner "shell" {
    execute_command = "sudo su - root -c \"sh {{ .Path }} \""
    script          = "foo.sh"
  }
}
It appears you are having trouble connecting via SSH to the Packer-created instance for your GCP image build. This error message indicates that the authentication process failed, which can happen if the credentials are incorrect or the necessary permissions are not granted. Check whether the Compute Instance Admin v1 and Service Account User roles have the required access rights. In addition, the project's firewall rules may need to allow incoming SSH connections on the port you're using; the official GCP documentation covers how to configure firewall rules. You can also connect to the instance with the "gcloud compute ssh" command to continue troubleshooting.
Attaching the SSH troubleshooting documentation for reference.
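For example, a minimal firewall rule that permits inbound SSH on the default port might look like this (the rule name, network, and source range are assumptions; adjust them to your environment):
# Hypothetical example: allow inbound SSH (TCP 22) on the default network
gcloud compute firewall-rules create allow-packer-ssh \
    --network=default \
    --direction=INGRESS \
    --allow=tcp:22 \
    --source-ranges=0.0.0.0/0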
The problem was associated with Qwiklabs. I was using the lab environment provided by Qwiklabs for testing Packer and GCP.
Once I deployed the same configuration in a regular GCP project, Packer ran successfully. This suggests there are some constraints in the Qwiklabs lab environment.

Not able to delete Cloud Composer environment

When trying to delete my Cloud Composer environment, it gets stuck complaining about insufficient permissions. I have deleted the storage bucket, the GKE cluster, and the deployment according to this post:
Cannot delete Cloud Composer environment
And the service account is the standard compute SA.
DELETE operation on this environment failed 33 minutes ago with the following error message:
Could not configure workload identity: Permission iam.serviceAccounts.getIamPolicy is required to perform this operation on service account projects/-/serviceAccounts/"project-id"-compute@developer.gserviceaccount.com.
Even though I temporarily made the compute account a Project Owner and IAM Security Admin, it does not work.
I've tried to delete it through the GUI, the gcloud CLI, and Terraform without success. Any advice or things to try out will be appreciated :)
I got help from Google support: instead of addressing the SA projects/-/serviceAccounts/"project-id"-compute@developer.gserviceaccount.com,
it was apparently the default service agent, which has the format
service-"project-nr"@cloudcomposer-accounts.iam.gserviceaccount.com, that needed the
Cloud Composer v2 API Service Agent Extension role.
Thank you for the kind replies!
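For anyone hitting the same error, a hedged sketch of granting that role to the Composer service agent with gcloud (PROJECT_ID and PROJECT_NUMBER are placeholders) could look like:
# Assumed example: grant the Cloud Composer v2 API Service Agent Extension role
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:service-PROJECT_NUMBER@cloudcomposer-accounts.iam.gserviceaccount.com" \
    --role="roles/composer.ServiceAgentV2Ext"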
The iam.serviceAccounts.getIamPolicy issue seems to be more related to the credentials: your environment is having trouble retrieving credential data.
You should set your credentials path variable again:
export GOOGLE_APPLICATION_CREDENTIALS=fullpath.json
There is another option you can try, which is to run:
gcloud auth activate-service-account
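For reference, that command is normally given a service account email and a key file; the values below are placeholders:
# Placeholder example: activate a service account from a downloaded key file
gcloud auth activate-service-account my-sa@my-project.iam.gserviceaccount.com \
    --key-file=/path/to/key.json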
You can also add the credentials to your Terraform configuration:
provider "google" {
credentials = file(var.service_account_file_path)
project = var.project_id
}
Don't forget that you need the correct roles to delete the Composer environment.
For more details about it you can check:
https://cloud.google.com/composer/docs/delete-environments#gcloud
https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/composer_environment
https://cloud.google.com/composer/docs/how-to/access-control?hl=es_419

Error loading Namespaces. Unauthorized: Verify you have access to the Kubernetes cluster

I have created an EKS cluster using the eksctl command line and verified that the application is working fine.
But I am noticing a strange issue: when I try to access the nodes in the cluster in the web browser, I see the following error
Error loading Namespaces
Unauthorized: Verify you have access to the Kubernetes cluster
I am able to see the nodes using kubectl get nodes
I am logged in as the admin user. Any help on how to work around this would be really great. Thanks.
You will need to add your IAM role/user to your cluster's aws-auth config map
Basic steps to follow taken from https://docs.aws.amazon.com/eks/latest/userguide/add-user-role.html
kubectl edit -n kube-system configmap/aws-auth
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
data:
  mapRoles: |
    - rolearn: <arn:aws:iam::111122223333:role/eksctl-my-cluster-nodegroup-standard-wo-NodeInstanceRole-1WP3NUE3O6UCF>
      username: <system:node:{{EC2PrivateDNSName}}>
      groups:
        - <system:bootstrappers>
        - <system:nodes>
  mapUsers: |
    - userarn: <arn:aws:iam::111122223333:user/admin>
      username: <admin>
      groups:
        - <system:masters>
    - userarn: <arn:aws:iam::111122223333:user/ops-user>
      username: <ops-user>
      groups:
        - <system:masters>
I am also seeing this error; it was introduced by the latest addition to EKS, see https://aws.amazon.com/blogs/containers/introducing-the-new-amazon-eks-console/
Since then, the console makes requests to EKS on behalf of the user or role you are logged in as.
So make sure the kube-system:aws-auth ConfigMap has that user or role added.
This user/role might not be the same one you are using locally with the AWS CLI, hence kubectl might work while you still see that error!
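If you prefer not to hand-edit the ConfigMap, a rough sketch with eksctl (cluster name, region, and ARN are placeholders) would be:
# Assumed example: map the IAM user the console signs in as into the aws-auth ConfigMap
eksctl create iamidentitymapping \
    --cluster my-cluster \
    --region us-east-1 \
    --arn arn:aws:iam::111122223333:user/console-user \
    --username console-user \
    --group system:masters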
Amazon recently (December 2020) added a new feature that allows you to browse workloads inside the cluster from the AWS Console.
If you are missing permissions you will get that error.
The permissions that are needed are described here:
https://docs.aws.amazon.com/eks/latest/userguide/security_iam_id-based-policy-examples.html#policy_example3
This might also be because you created the AWS EKS cluster using a different IAM user than the one currently logged into the AWS Management Console; hence the IAM user currently logged into the console does not have permissions to view the namespaces on the AWS EKS cluster.
Try logging in to the AWS Management Console using the IAM credentials of the user who created the AWS EKS cluster, and the issue should be fixed.
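To check which identity your local AWS CLI (and therefore kubectl via the authenticator) is actually using, and compare it with the user shown in the console, you can run:
# Shows the IAM user/role your current AWS credentials resolve to
aws sts get-caller-identity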

Cloud Build having an issue when running the kubectl cloud builder

I am trying to deploy new changes to a Kubernetes cluster using Google Cloud Build. Whenever I make some changes the trigger works fine and starts a new build, but here is the issue I am getting with this cloudbuild.yaml.
cloudbuild.yaml
steps:
#step1
- name: 'gcr.io/cloud-builders/docker'
  args: ['build', '-t', 'gcr.io/$PROJECT_ID/cloudbuildtest-image', '.']
#step 2
- name: 'gcr.io/cloud-builders/docker'
  args: ['push', 'gcr.io/$PROJECT_ID/cloudbuildtest-image']
#step 3 for testing
- name: 'gcr.io/cloud-builders/kubectl'
  args: ['get', 'pods']
  env:
  - 'CLOUDSDK_COMPUTE_ZONE=us-central1-a'
  - 'CLOUDSDK_CONTAINER_CLUSTER=cloudbuild-test'
#STEP-4
images:
- 'gcr.io/$PROJECT_ID/cloudbuildtest-image'
Steps 1 and 2 are working fine, but the issue is with step 3, where for testing purposes I simply ran the get pods command to see whether it would work. Here is the issue I am getting in the logs.
Running: gcloud container clusters get-credentials --project="journeyfoods-io" --zone="us-central1-a" "cloudbuild-test"
Fetching cluster endpoint and auth data.
ERROR: (gcloud.container.clusters.get-credentials) ResponseError: code=403, message=Required "container.clusters.get" permission(s) for "projects/XXXX/zones/us-central1-a/clusters/cloudbuild-test".
What permissions is it looking for? Do I need to do some authentication before running the steps, or what exactly am I missing?
The steps of a Cloud Build build are executed using the [PROJECT_NUMBER]@cloudbuild.gserviceaccount.com service account. From the Cloud Build documentation page about this topic:
When you enable the Cloud Build API, the service account is
automatically created and granted the Cloud Build Service Account role
for your project. This role is sufficient for several tasks,
including:
Fetching code from your project's Cloud Source Repository
Downloading files from any Cloud Storage bucket owned by your project
Saving build logs in Cloud Logging
Pushing Docker images to Container Registry
Pulling base images from Container Registry
But this service account does not have permissions for certain actions by default (in particular, the container.clusters.get permission is not granted by default). So you need to grant it a proper IAM role. In your case the Kubernetes Engine Developer role contains the container.clusters.get permission, as you can see on this page.
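A hedged sketch of granting that role to the Cloud Build service account (PROJECT_ID and PROJECT_NUMBER are placeholders):
# Assumed example: grant Kubernetes Engine Developer to the Cloud Build service account
gcloud projects add-iam-policy-binding PROJECT_ID \
    --member="serviceAccount:PROJECT_NUMBER@cloudbuild.gserviceaccount.com" \
    --role="roles/container.developer"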

"kubectl" not connecting to aws EKS cluster from my local windows workstation

I am trying to set up an AWS EKS cluster and want to connect to that cluster from my local Windows workstation, but I am not able to. Here are the steps I did:
Create an AWS service role (AWS console -> IAM -> Roles -> click "Create role" -> select the AWS service role "EKS" -> give it the role name "eks-role-1").
Create another user in IAM named "eks" for programmatic access. This will help me connect to my EKS cluster from my local Windows workstation. The policies I added to it are "AmazonEKSClusterPolicy", "AmazonEKSWorkerNodePolicy", "AmazonEKSServicePolicy", and "AmazonEKS_CNI_Policy".
Next, the EKS cluster was created with the role ARN created in step 1. Finally, the EKS cluster showed up as created in the AWS console.
On my local Windows workstation, I downloaded "kubectl.exe" and "aws-iam-authenticator.exe" and ran 'aws configure' using the access key and secret from step 2 for the user "eks". After configuring "~/.kube/config", I ran the command below and got an error like this:
Command: kubectl.exe get svc
output:
could not get token: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
could not get token: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
could not get token: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
could not get token: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
could not get token: NoCredentialProviders: no valid providers in chain. Deprecated.
For verbose messaging see aws.Config.CredentialsChainVerboseErrors
Unable to connect to the server: getting credentials: exec: exit status 1
Not sure what is wrong with the setup here. Can someone please help? I know some places say you have to use the same AWS user to connect to the cluster (EKS). But how can I get an access key and token for the AWS service role (eks-role-1)?
For people who run into this: maybe you provisioned EKS with a named profile.
EKS does not add the profile inside the kubeconfig.
Solutions:
1. Export the AWS credentials:
$ export AWS_ACCESS_KEY_ID=xxxxxxxxxxxxx
$ export AWS_SECRET_ACCESS_KEY=ssssssssss
2. If you've already configured AWS credentials, try exporting AWS_PROFILE:
$ export AWS_PROFILE=ppppp
3. Similar to 2, but you only need to do it once. Edit your kubeconfig:
users:
- name: eks  # This depends on your config.
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      command: aws-iam-authenticator
      args:
        - "token"
        - "-i"
        - "general"
      env:
        - name: AWS_PROFILE
          value: "<YOUR_PROFILE_HERE>"
I think I got the answer for this issue; I want to write it down here so people will benefit from it.
When you create the EKS cluster for the first time, check which user you are creating it as (check your AWS web console user settings). Even if you create it from a CloudFormation script, you can also assign a different role to create the cluster. You have to get CLI access for that user to start accessing your cluster with the kubectl tool. Once you get first-time access (that user has admin access by default), you may need to add another IAM user into the cluster admin (or another role) using the aws-auth ConfigMap; only then can you switch to or use an alternative IAM user to access the cluster from the kubectl command line.
Make sure the file ~/.aws/credentials has an AWS access key and secret key for an IAM account that can manage the cluster.
Alternatively, you can set the AWS environment variables:
export AWS_ACCESS_KEY_ID=xxxxxxxxxxxxx
export AWS_SECRET_ACCESS_KEY=ssssssssss
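For reference, a minimal ~/.aws/credentials file could look like this (the key values are placeholders):
[default]
aws_access_key_id = AKIAXXXXXXXXXXXXXXXX
aws_secret_access_key = xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx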
Adding another option.
Instead of working with aws-iam-authenticator you can change the command to aws and replace the args as below:
- name: my-cluster
  user:
    exec:
      apiVersion: client.authentication.k8s.io/v1alpha1
      args:          # <--- Change the args
        - --region
        - <YOUR_REGION>
        - eks
        - get-token
        - --cluster-name
        - my-cluster
      command: aws   # <--- Change the command to aws
      env:
        - name: AWS_PROFILE
          value: <YOUR_PROFILE_HERE>
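Alternatively, the AWS CLI can generate this exec-based kubeconfig entry for you; a hedged example (region, cluster name, and profile are placeholders):
# Assumed example: write/update the kubeconfig entry for the cluster
aws eks update-kubeconfig --region us-east-1 --name my-cluster --profile my-profile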