Unable to sudo to Deep Learning Image

I installed the latest Google Cloud Deep Learning VM Image today. After the VM launched, I was able to run sudo -i successfully over browser SSH.
Once logged in, I started my TensorFlow model training in the background (using &). A few hours later I was unable to log in as root.
I get the following message:
We trust you have received the usual lecture from the local System
Administrator. It usually boils down to these three things:
#1) Respect the privacy of others.
#2) Think before you type.
#3) With great power comes great responsibility.
[sudo] password for my_username:
I tried:
sudo -i
su sudo -i
su root
I was able to replicate the issue. Any suggestions?

This issue was caused by an internal problem on Google's side that removes the user from the “google-sudoers” group. For all affected instances, I suggest the workaround below until the permanent fix has been rolled out.
Use a different username:
If using the browser SSH window, click the settings icon (top right) and choose Change Linux Username in the drop-down.
Using the SDK:
$ gcloud compute ssh newusername@instance
Alternatively, enable OS Login on the instance (set "enable-oslogin=True" in metadata), as described in the linked article.
You can track the permanent fix by following the Public Issue tracker.
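
The OS Login part of the workaround can also be applied from the command line. Below is a minimal sketch, assuming the Cloud SDK is installed and authenticated. The function name enable_os_login and the instance and zone arguments are placeholders for illustration and do not come from the original answer.

```shell
# Sketch: turn on OS Login for a single instance via instance metadata.
# Assumes the gcloud CLI is installed and authenticated for the project.
# enable_os_login is a hypothetical helper name; pass your own instance and zone.
enable_os_login() {
    instance="$1"
    zone="$2"
    gcloud compute instances add-metadata "$instance" \
        --zone "$zone" \
        --metadata enable-oslogin=TRUE
}
```

It would be called as enable_os_login my-instance us-central1-a (both values hypothetical). With OS Login enabled, sudo rights depend on IAM roles such as roles/compute.osAdminLogin, so the account needs one of those roles as well.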

The original answer:
One possible solution is to add an SSH key in the Google Cloud Console and log in with another SSH client.
Additional answer:
I do not know why, but sometimes the user suddenly stops being a member of the google-sudoers group.
In that case it is enough to have another user with administrator privileges add your user back to the group:
# usermod -aG google-sudoers your_user_name
Of course, this only works if such a user exists.
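
Before and after changing the group, it can help to confirm whether the account is actually in google-sudoers. The following is only a sketch; user_in_group is a hypothetical helper name and not part of any Google tooling.

```shell
# Sketch: report whether a user is currently in a given group.
# user_in_group is a hypothetical helper name, not part of any Google tooling.
user_in_group() {
    user="$1"
    group="$2"
    # id -nG prints the user's group names on one line, separated by spaces;
    # split them onto separate lines and look for an exact, literal match.
    id -nG "$user" 2>/dev/null | tr ' ' '\n' | grep -qxF -- "$group"
}
```

For example, user_in_group your_user_name google-sudoers && echo member || echo missing. A new login session is usually needed before a restored membership takes effect.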

Related

How to Connect Secure Shell App to a Google Cloud VM Instance

I would like to connect to a Google Cloud VM instance using Secure Shell App (SSA). I assumed this would be easy, since both are Google products and I previously had no problem connecting SSA to a DigitalOcean Droplet. I found Google's own documentation for this and it looked easy enough to follow. However, one link in the instructions, "Providing public SSH keys to instances", leads down a rabbit hole of confusing and seemingly self-contradictory information. I followed it as best I could but kept running into errors. I have searched in vain for better instructions and am still astounded that Google has made it so hard to connect their own products. Is it really this hard to make this work? Are there any better instructions out there? If not, would someone be willing to write up clear and simple instructions?
Please follow these step-by-step instructions:
create a new VM, instance-1
connect to it with gcloud compute ssh instance-1 (as mentioned by @John Hanley)
check ~/.ssh folder
$ ls -l ~/.ssh
-rw------- 1 user usergroup 1856 Dec 9 17:12 google_compute_engine
-rw-r--r-- 1 user usergroup 417 Dec 9 17:12 google_compute_engine.pub
copy keys
cp ~/.ssh/google_compute_engine.pub mykey.pub
cp ~/.ssh/google_compute_engine mykey
follow instructions from step 7 - create connection and import identity
(optional) if you don't find mykey in the Identity list, try to connect anyway (this ends with an error, as expected), then restart Secure Shell App and check the Identity menu again (the keys should be there without repeating the import)
After that, I successfully connected to my VM via Secure Shell App.

using oslogin on gcp with osAdminLogin role a user can't sudo on the instance

I have some GCP users with the roles:
* compute.instances.osAdminLogin
* iam.serviceAccountUser
They connect through SSH from the GCP web interface in Compute Engine.
When they run sudo ls, some users are asked for a password and some are not.
In the folder /var/google-sudoers.d/, the files of the users who can sudo without a prompt contain:
user_name ALL=(ALL) NOPASSWD: ALL
For the other users, the files are empty.
os information :
uname -a
Linux xxx 4.15.0-1027-gcp #28~16.04.1-Ubuntu SMP Fri Jan 18 10:10:51 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
For the same users, on another vm, in the same gcp project, they all can do sudo.
I am expecting that for all users having the same roles, they have the same sudo behaviour on instances.
What should I do so that my users are able to sudo? (Other than overwriting the empty files in /var/google-sudoers.d/, which works but may not be stable.)
I had a similar problem on a project that was originally set up with the legacy login system (based on SSH keys stored in instance or project metadata). When I converted the project to use OS Login, I lost the ability to sudo without a password on one VM instance. This was a major problem, since I had never set a password for my user account, and therefore was unable to sudo to troubleshoot the problem.
Things I tried that did NOT work:
Rebooting the instance
Explicitly adding role roles/compute.osAdminLogin to my IAM account (I was already a project owner)
I solved the problem by editing the project compute engine metadata to disable OS Login. After disabling, I confirmed that I was able to log into the problematic instance and sudo without a password. I then edited the project metadata again to re-enable OS Login. This time, passwordless sudo worked on the problematic instance. It appears that the instance was not fully reconfigured the first time I switched from legacy login to OS Login.
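
The diagnosis and the toggle described above could be scripted roughly as follows. This is a sketch under assumptions: both function names are hypothetical, reading /var/google-sudoers.d normally requires root, and the gcloud call assumes an authenticated Cloud SDK with a default project set.

```shell
# Sketch of the two steps: diagnose, then toggle OS Login for the project.
# Both function names are hypothetical helpers.

# Print the names of zero-length files in the OS Login sudoers directory.
# Reading /var/google-sudoers.d normally requires root.
list_empty_sudoers() {
    dir="${1:-/var/google-sudoers.d}"
    for f in "$dir"/*; do
        # -f: regular file; ! -s: file has size zero
        if [ -f "$f" ] && [ ! -s "$f" ]; then
            basename "$f"
        fi
    done
}

# Set enable-oslogin in project-wide metadata to TRUE or FALSE.
# Assumes the gcloud CLI is authenticated and a default project is set.
set_project_os_login() {
    gcloud compute project-info add-metadata --metadata "enable-oslogin=$1"
}
```

The sequence from the answer would then be set_project_os_login FALSE, a check that sudo works on the instance, and set_project_os_login TRUE. Running list_empty_sudoers afterwards should print nothing if every user file was regenerated.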

Can connect with existing SSH key but not a new one

I'm trying to give access to a GCE VM by adding SSH Keys to the project metadata. My current SSH key is in the project metadata and I can connect just fine using:
ssh -i ~/.ssh/<private_key> <username>@<instance_ip>
Now, I generated another key:
ssh-keygen -t rsa -f ~/.ssh/<new_key> -C <new_username>
After adding the generated public key to the project metadata, I then run:
ssh -i ~/.ssh/<new_private_key> <new_username>@<instance_ip>
But I get Permission denied (publickey,gssapi-keyex,gssapi-with-mic). Running with -vvv flag doesn't show me much besides the key being rejected.
Things I know/checked:
firewall isn't an issue because I can connect using my original key from same location
the instance is running SSH (running nc <instance_nat_ip> 22 shows "OpenSSH" etc.)
no passphrases were used with the generation of any SSH key
there are no instance-level restrictions on project-wide metadata
there are no instance-level ssh keys already added
there are no newlines/breaks causing the key to be malformed
permissions on ~/.ssh aren't an issue, since another key pair works fine from the same directory; additionally, both key pairs have the same permissions anyway
OSLogin isn't enabled either on the project or instance
Things I've tried:
removing and re-adding the SSH keys in project metadata
trying with new key pairs generated on another person's machine
restarting sshd service
Questions:
Does the username specified during the ssh-keygen step have to already exist on the remote instance before the key is added to the metadata, i.e. do I have to run sudo useradd <new_username> while SSH'd into the instance? (Update: creating a new test instance revealed this not to be the case; all users in the project metadata were created automatically.)
why does my existing SSH key work and not new ones even though they are added the same way?
there's a chance the enable-oslogin:TRUE was applied to the instance briefly a long time ago (I'm not sure since I'm not the one who created the instance) but it's no longer there in the instance or project metadata. Would having that been enabled, even briefly, cause some issues?
EDIT: I started up a new instance in the same project with the same network details and I was able to SSH to that instance using the new key. Original instance is still denying the key
Did some digging around and found out that the systemd service that propagates accounts information from the metadata server is a daemon called google-accounts-daemon.
When I ran sudo ps aux | grep daemon I didn't see it running as I did on the test instance I created.
So when I ran sudo systemctl restart google-accounts-daemon the SSH keys magically propagated and everything worked.
I have no idea what caused the daemon to stop running in the first place, so if anyone has ideas, that'd be appreciated in case this comes up in the future.
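
The check-and-restart described above can be wrapped in a small helper. This is a sketch only: ensure_accounts_daemon is a hypothetical name, and the default service name is the one from this answer. On newer guest environments the same job is, as far as I know, handled by google-guest-agent, so the name can be passed as an argument.

```shell
# Sketch: restart the metadata account sync service if it is not running.
# ensure_accounts_daemon is a hypothetical helper; the default service name
# comes from the answer above and may differ on newer images.
ensure_accounts_daemon() {
    svc="${1:-google-accounts-daemon}"
    # is-active --quiet exits 0 only when the unit is active
    if systemctl is-active --quiet "$svc"; then
        echo "$svc is already running"
    else
        echo "$svc is not running; restarting"
        sudo systemctl restart "$svc"
    fi
}
```

If the daemon keeps stopping, sudo journalctl -u google-accounts-daemon may show why it exited.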

I can't connect VM on GCP as root

I can't connect to a VM on GCP as root using browser SSH.
Has anyone had the same problem?
The following message is displayed:
You can drastically improve your key transfer times by migration to OS login.
It might have been caused by setting a password...
By default, you log in as your GCP user. To become root, run the following command once browser SSH is working:
sudo -s
If you cannot login with browser SSH, then I suspect a permission issue with that particular user.
The above is the recommended way of doing things, however if logging in as root is absolutely needed, please follow the steps below:
As root, edit the sshd_config file in /etc/ssh/sshd_config:
nano /etc/ssh/sshd_config
Make sure PermitRootLogin is set to “yes” and save the /etc/ssh/sshd_config file.
Restart the SSH server:
service sshd restart
Change the username to root by clicking the gear icon in the top right corner and selecting “Change Linux Username”.
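
Before restarting sshd, it may be worth confirming what the file actually says. The helper below is a rough sketch; permit_root_login_value is a hypothetical name, and it deliberately ignores Include files and Match blocks, so treat its answer as a quick check rather than the effective setting.

```shell
# Sketch: print the first active PermitRootLogin value in an sshd_config file.
# permit_root_login_value is a hypothetical helper name. It ignores Include
# files and Match blocks, so treat the result as a quick check only.
permit_root_login_value() {
    file="${1:-/etc/ssh/sshd_config}"
    # sshd uses the first value it finds for a keyword; commented lines
    # start with '#', so their first field never equals the keyword.
    awk 'tolower($1) == "permitrootlogin" { print tolower($2); exit }' "$file"
}
```

After editing, sudo sshd -t checks the file for syntax errors, and sudo sshd -T | grep -i permitrootlogin prints the value sshd will actually use.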

Jenkins AccessDeniedException upon trying to enable security on Jenkins on an EC2

Last night, while setting up Jenkins from the jenkins.war file, I tried to enable security with a username and password. I ticked the "Disable read access to anonymous" checkbox, and right after that I got an access-denied screen, which persisted even after logging in with the new credentials I had just created. I have tried the following, and each attempt still ends at the same screen:
removing anything on the EC2 that had to deal with Jenkins (sudo find / -name "*jenkins*" followed by sudo rm [-rf] on anything that popped up in the results)
re-visiting that site after doing the above option
re-installing the WAR file
installing Jenkins as a service
attempting login again
Is there a way out of this?
I should have checked the processes and killed the one that was Jenkins. The process somehow outlived its JAR/WAR executable!