Oftentimes a rogue process gets into a busy-spin mode and uses up 100% of the CPUs. I have a GCP Ubuntu instance with 4 CPU cores and 32 GB of RAM. I still run into this situation of 100% CPU usage, and then I can't even SSH into the VM instance.
Does GCP provide a way of killing the offending process? Through gcloud SDK command or web console?
As Serhii Rohoza mentioned, GCP does not provide any tool to kill a process.
Instead, you can SSH into your VM instance, figure out which process is eating up your CPU, and stop it by following these steps:
Open a terminal with Ctrl+Alt+t
Execute the command "top"
Note the process using the most CPU
If the process isn't a system process, kill it with "sudo pkill [processname]"
where [processname] is the name of the process you want to kill.
If it is a system process, don't kill it; instead, search for its name to find out what it does in Ubuntu.
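For a non-interactive view of the same information, here is a minimal sketch. It assumes the procps `ps` that ships with Ubuntu, and `top_cpu` is a made-up helper name. Note that the CPU figure reported by `ps` is averaged over the lifetime of each process, so `top` remains the better tool for spotting a sudden spike.

```shell
# List the N processes using the most CPU (default 5), highest first.
# Assumes procps ps (standard on Ubuntu); top_cpu is a hypothetical helper name.
top_cpu() {
    n="${1:-5}"
    # +1 keeps the header line in the output
    ps -eo pid,user,pcpu,comm --sort=-pcpu | head -n "$((n + 1))"
}

top_cpu 5
# Once the offender is identified, and only if it is not a system process:
#   sudo pkill processname
```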
I have installed the GCP monitoring and logging agents on my Compute Engine instance. Memory consumption has increased by more than 50% since they were installed.
Is there any way to stop the growth in memory utilization and get back to the initial memory consumption?
I have 3.75 GB of RAM, of which more than 3 GB is in use, and more than 2 GB of that is consumed by this process: "/opt/google-fluentd/embedded/bin/ruby -Eascii-8bit:ascii-8bit /usr/sbin/google-fluentd --log /var/log/google-fluentd/google-fluentd.log --daemon /var/run/google-fluentd/google-fluentd.pid --under-supervisor"
Update:
After restarting the google-fluentd service, memory usage comes back down. But I need to know the reason for the increased memory consumption. Is it a bug in the fluentd service?
Yes, it seems to be a known issue. The Logging Agent product team is still working on finding a fix for this issue. You can track google-fluentd (monitoring service) memory usage increase for updates.
Meanwhile, the only workaround is to schedule a cron job that restarts the fluentd agent periodically.
To restart the agent, run the following command on your instance (schedule it with cron to repeat it periodically):
$ sudo service google-fluentd restart
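A cron entry for this workaround could look like the sketch below. The schedule (daily at 03:00) and the file name under /etc/cron.d are arbitrary choices for illustration, not values from the answer; the full path to `service` is used because cron runs with a minimal PATH.

```shell
# Example schedule only: every day at 03:00, run as root.
# Fields: minute hour day-of-month month day-of-week user command
CRON_LINE='0 3 * * * root /usr/sbin/service google-fluentd restart'
echo "$CRON_LINE"

# To install it system-wide (requires root; the file name is hypothetical):
#   echo "$CRON_LINE" | sudo tee /etc/cron.d/restart-google-fluentd
```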
Another recommendation is to check periodically that there are not multiple Logging agent instances running on the VM.
Use ps aux | grep "/usr/sbin/google-fluentd" to show running agent processes (there should be only two: one supervisor and one worker), and sudo netstat -nltp | grep :24231 to show running processes that occupy the port. Kill older instances as you see fit.
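The process count check can be scripted. This is a minimal sketch in which `count_procs` is a hypothetical helper; it relies on `pgrep -f`, which matches against the full command line and never reports itself.

```shell
# Count processes whose full command line matches a pattern.
# count_procs is a hypothetical helper name.
count_procs() {
    pgrep -f -- "$1" | wc -l
}

agents=$(count_procs '/usr/sbin/google-fluentd')
if [ "$agents" -gt 2 ]; then
    echo "Found $agents google-fluentd processes; expected 2 (supervisor + worker)"
fi
```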
Edit :
Check whether your fluent-plugin-systemd version is upgraded to 1.0.5 by using the command:
$ /opt/google-fluentd/embedded/bin/gem list | grep fluent-plugin-systemd
If it is not upgraded to 1.0.5, you can upgrade using fluent-plugin-systemd 1.0.5.
If you have fluent-plugin-systemd 1.0.5 but are still seeing the issue, it might be the buffer output plugin issue that is still under investigation in https://github.com/fluent/fluentd/issues/3401
This question seems very basic, but I wasn't able to quickly find an answer at https://cloud.google.com/compute/docs/instances/create-start-instance. I'm running a MicroMDM server on a Google Cloud VM by connecting to it using SSH (from the VM instances page in the Google Cloud Console) and then running the command
> sudo micromdm serve
However, I notice that when I shut down my laptop, the server also stops, which is actually why I wanted to run the server in a VM in the first place.
What would be the recommended way to keep the server running? Should I use systemd or perhaps run the process as a Docker container?
When you run the service from the command line, it is attached to your shell process, so when your SSH session ends, the job is terminated as well.
To run a process in the background, append & to the end of the command. In your case:
sudo micromdm serve &
This way the server can stay alive after you quit your session. Note that a background job may still receive a hangup signal (SIGHUP) if the connection drops, so prefixing the command with nohup is safer.
I also suggest adding that line to the instance startup script if you want the server to always be up, so that you don't have to run the command by hand each time :)
More on Compute Engine startup scripts here.
As the Using MicroMDM with systemd documentation suggests, you can use systemd to run the MicroMDM service on Linux. First, create the micromdm.service file on the Linux host, then move it to /etc/systemd/system/micromdm.service. After that, you can start the service. systemd will then keep the service running, restarting it after a failure or a server reboot.
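As a rough illustration of those steps, the sketch below generates a unit file in the current directory. It is not the unit file from the MicroMDM documentation: the binary path /usr/local/bin/micromdm, the description, and the restart policy are all assumptions to adjust for your setup.

```shell
# Assumed install path of the binary; override by exporting MICROMDM_BIN.
MICROMDM_BIN="${MICROMDM_BIN:-/usr/local/bin/micromdm}"
UNIT_FILE="${UNIT_FILE:-./micromdm.service}"

# Write a minimal unit: start "micromdm serve", restart it if it fails,
# and make it eligible to start at boot.
cat > "$UNIT_FILE" <<EOF
[Unit]
Description=MicroMDM server
After=network.target

[Service]
ExecStart=$MICROMDM_BIN serve
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF

# Then, with root privileges:
#   sudo mv micromdm.service /etc/systemd/system/micromdm.service
#   sudo systemctl daemon-reload
#   sudo systemctl enable --now micromdm.service
```

`enable --now` both starts the service immediately and registers it to start on boot; `systemctl status micromdm.service` shows whether it is running.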
When I connect to an EC2 instance via MobaXterm, my Jupyter notebook's kernel loses its connection after some period of time.
As a result, highly time-consuming operations (the currently running tasks) have to be re-run again and again, and they never finish. This happens every time.
The kernel doesn't reconnect on its own, so I close the notebook and restart it to regain a connection, and I have to repeat this whenever the kernel eventually dies.
Sometimes it also shows an SSL error (wrong version number) before disconnecting.
I have also faced a similar problem. I solved it with the help of 'tmux'.
I followed these steps:
I installed 'tmux' in my machine in the AWS instance.
[Actually, it came preinstalled with the AMI I had been using on the EC2 instance.]
I created a 'tmux' session simply by entering the command: tmux
Then I ran necessary commands to run the Jupyter server or Jupyter notebook
To detach from the session so that I could close the terminal, I used this key sequence: (i) ctrl + b, (ii) d
[Please note that the session will continue running on the EC2 instance until you stop the instance or shut down the Jupyter server or notebook.]
To connect to the session again, I used the command: tmux attach
To finally kill the 'tmux' session when I am done, I used the command: tmux kill-session
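The steps above can also be done without entering tmux interactively. A minimal sketch, in which `start_in_tmux` and the session name "jupyter" are made up for illustration:

```shell
# start_in_tmux is a hypothetical helper: run a command inside a new,
# already-detached tmux session so it survives an SSH disconnect.
start_in_tmux() {
    session="$1"
    shift
    tmux new-session -d -s "$session" "$*"
}

# Example (session name "jupyter" is arbitrary):
#   start_in_tmux jupyter jupyter notebook --no-browser
#   tmux attach -t jupyter        # reattach later; detach with Ctrl+b, then d
#   tmux kill-session -t jupyter  # stop everything when done
```

Naming the session makes it easier to pick the right one when several tmux sessions are running on the same instance.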
Just use nohup. It is available on virtually every Linux machine.
So you should do: nohup jupyter notebook > output.txt
And then you can safely terminate the console session without worrying about killing the notebook.
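If you also want the prompt back and a way to stop the server later, one option is to background the command and record its PID. This is a sketch, not part of the original answer; `run_detached` is a hypothetical helper name.

```shell
# run_detached is a hypothetical helper: start a command that ignores
# hangups, send stdout and stderr to a log file, and print its PID.
run_detached() {
    log="$1"
    shift
    nohup "$@" > "$log" 2>&1 &
    echo $!
}

# Example:
#   run_detached output.txt jupyter notebook --no-browser > jupyter.pid
#   tail -f output.txt            # inspect the output after reconnecting
#   kill "$(cat jupyter.pid)"     # stop the server when finished
```

Writing the PID to a file rather than a shell variable means it is still available in a later SSH session.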
I'm trying to run a big ML algorithm that takes about 4 hours in IPython. The problem is that my local CPU cannot handle this, so I have to use AWS. My network isn't very stable and there are frequent disconnects from the server. So, my question is:
How can I run a cell from the command line (over SSH) with the nohup option so that it continues running even after I disconnect from the IPython server? And how do I go back to fetch the results and kill the process once it has finished?
I'm trying to automate VMWare Desktop on Windows 7 to suspend all VMs before I do a backup job each night. I used to have a script that did this, but I've noticed that the same command that used to work no longer suspends them.
If I do vmrun list I get a list of the running vms with no issue.
If I do vmrun suspend "V:\Virtual Machines\RICHARD-DEV\RICHARD-DEV.vmx" it just hangs and I have to kill the command with CTRL+C.
I've even tried a newer form of the command using -T to specify that it's Workstation, i.e. vmrun -T ws suspend "V:\Virtual Machines\RICHARD-DEV\RICHARD-DEV.vmx", and still no luck.
If I have the vm already stopped, I can issue vmrun start "V:\Virtual Machines\RICHARD-DEV\RICHARD-DEV.vmx" and it starts fine.
As well as the suspend command, the stop command also does not work. I'm running VMWare Workstation 11.1.3 build-3206955 on Windows 7.
Any ideas?
Update:
I installed latest VMWare Tools on the guest, as well as the latest Vix on the Host so everything should be up to date.
I can start a vm using vmrun with no problem using vmrun -T ws start <path to vmx> but the command doesn't come back to the command prompt, so I'm assuming it's not getting confirmation from the vm that it is now running.
If I cancel the 'start' command and now try and suspend I'm getting the same lack of communication from the guest. If I manually suspend the vm, once it's suspended I get an 'Error: vm is not running' and the 'suspend' command finally times out and comes back.
So, it looks to me like there is no communication from vmrun to the guest about what state it's in etc. Is there a way to debug the communication from the host to the guest using vmrun or other means? Are there ports I need open in the guest OS?
So, I never did get vmrun to work properly on my main system, although I did get it to behave OK on my laptop, so there is something odd happening on this machine. I also installed a trial of the latest VMWare 12 and the same thing happens.
As a workaround, I ended up changing the power management settings in my guest OS so that it would 'sleep' after 1 hr of inactivity. When this happens VMWare detects it and automatically suspends the guest which is really what I'm looking for. Not the most slick solution but it does manage to unlock the files I need to be backed up in a nightly backup.