Attaching volume to an already existing folder deleted all the data in DigitalOcean - digital-ocean

I bought a 100 GB volume and accidentally attached it to my /var folder. Now all the previous data is gone.
The command I ran:
mount -o discard,defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_volume-sgp1-01 /var
How can I undo this last action in DigitalOcean?

Unmounting the volume brings all of the data back: mounting over a non-empty directory only hides the existing files, it does not delete them.
Get the volume’s mount point with df if you don’t already know it:
sudo df --human-readable --print-type
Unmount the volume with umount:
sudo umount --verbose /mnt/use_your_mount_point
If umount reports "device is busy", kill the processes that are holding the mount point open.
Find them with:
sudo lsof +f -- /mnt/use_your_mount_point
Then inspect and kill each process found, e.g. ps -ef | grep <pid> and/or kill -9 <pid>.
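Put together for this specific case, the recovery looks roughly like this (a sketch, assuming the volume really is mounted over /var as in the question; /mnt/volume-sgp1-01 is just an example directory for remounting it somewhere harmless afterwards):
sudo df --human-readable --print-type       # confirm the volume is mounted on /var
sudo umount --verbose /var                  # unmount it; the original /var contents reappear
sudo lsof +f -- /var                        # only if umount says "device is busy": list the blockers
sudo kill -9 <pid>                          # stop each blocking process, then retry the umount
sudo mkdir -p /mnt/volume-sgp1-01           # remount the volume on a dedicated directory instead of /var
sudo mount -o discard,defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_volume-sgp1-01 /mnt/volume-sgp1-01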

Related

AWS EC2 terminal session terminated with "Plugin with name Standard_Stream not found"

I was streaming Kafka on AWS EC2 (CentOS 7). My Session Manager Idle Timeout is set to 60 min, and yet, after running for much less than that, the terminal froze, saying my session had been terminated. Of course, the Kafka streaming was disrupted as well.
When I tried to restart a new session with a new terminal, I got this error popup
Your session has been terminated for the following reasons: Plugin with name Standard_Stream not found. Step name: Standard_Stream
and I am still unable to restart a terminal.
What does this error mean and how do I resolve it? Thanks.
So far, you need to access the EC2 instance over SSH with the key .pem file to debug (ask your admin).
Running tail -f gave this issue:
tail: inotify resources exhausted
tail: inotify cannot be used, reverting to polling
Restarting the ssm-agent service also failed with "No space left on device", but it's not actually about disk space:
[root@env-test ec2-user]# systemctl restart amazon-ssm-agent.service
Error: No space left on device
[root@env-test ec2-user]# df -h |grep dev
devtmpfs 32G 0 32G 0% /dev
tmpfs 32G 0 32G 0% /dev/shm
/dev/nvme0n1p1 100G 82G 18G 83% /
So the error itself means that the system is running low on inotify watches, which enable programs to monitor file/directory changes. To see the currently set limit (output from my machine included):
$ cat /proc/sys/fs/inotify/max_user_watches
8192
Check which processes are using inotify, so you can either improve those apps or increase max_user_watches:
for foo in /proc/*/fd/*; do readlink -f $foo; done | grep inotify | sort | uniq -c | sort -nr
5 /proc/1/fd/anon_inode:inotify
2 /proc/7126/fd/anon_inode:inotify
2 /proc/5130/fd/anon_inode:inotify
1 /proc/4497/fd/anon_inode:inotify
1 /proc/4437/fd/anon_inode:inotify
1 /proc/4151/fd/anon_inode:inotify
1 /proc/4147/fd/anon_inode:inotify
1 /proc/4028/fd/anon_inode:inotify
1 /proc/3913/fd/anon_inode:inotify
1 /proc/3841/fd/anon_inode:inotify
1 /proc/31146/fd/anon_inode:inotify
1 /proc/2829/fd/anon_inode:inotify
1 /proc/21259/fd/anon_inode:inotify
1 /proc/1934/fd/anon_inode:inotify
Notice that the inotify list above includes the PIDs of the ssm-agent processes, which explains why we ran into the SSM issue once the max_user_watches limit was reached:
ps -ef | grep ssm-ag
root 3841 1 0 00:02 ? 00:00:05 /usr/bin/amazon-ssm-agent
root 4497 3841 0 00:02 ? 00:00:33 /usr/bin/ssm-agent-worker
Final solution (permanent, preserved across restarts):
echo "fs.inotify.max_user_watches=1048576" >> /etc/sysctl.conf
sysctl -p
Verify:
$ aws ssm start-session --target i-123abc456efd789xx --region ap-northeast-2
Starting session with SessionId: userdev-03ccb1a04a6345bf5
sh-4.2$
This issue comes from the EC2 instance, not from the SSM agent itself; see the linked article for more background on how the SSM agent works.
In my case, extending the disk space worked!
(syslog had filled the disk in my case)
In my case too, extending the disk space worked, as my /var/logs was huge.

Shell script stops when calling SSH

I am attempting to automate a few things on AWS with one script.
log in and shut down docker-compose then remove all images
copy local files to server
log in and start docker-compose
My script is
#log in and shut down docker-compose then remove all images
ssh -i "~/Documents/AWS-Keys/mykey.pem" ubuntu@XX.XXX.XX.XXX
docker-compose down
docker image prune -f
exit
#copy local files to server
scp -r -i "~/Documents/AWS-Keys/mykey.pem" ./ubuntu ubuntu@XX.XXX.XX.XXX:/home
#log in and start docker-compose
ssh -i "~/Documents/AWS-Keys/mykey.pem" ubuntu@XX.XXX.XX.XXX
docker-compose up -d
exit
I have also tried logout instead of exit, same result.
Running
$ ./upload.sh
The output is:
Welcome to Ubuntu 20.04.2 LTS (GNU/Linux 5.4.0-1038-aws x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
System information as of Tue Mar 2 21:52:40 UTC 2021
System load: 0.07
Usage of /: 66.0% of 7.69GB
Memory usage: 36%
Swap usage: 0%
Processes: 115
Users logged in: 1
IPv4 address for xxxxxxxxxxxxxxx: XXX.XX.X.X
IPv4 address for docker0: XXX.XX.X.X
IPv4 address for eth0: XXX.XX.X.XXX
* Introducing self-healing high availability clusters in MicroK8s.
Simple, hardened, Kubernetes for production, from RaspberryPi to DC.
https://microk8s.io/high-availability
3 updates can be installed immediately.
0 of these updates are security updates.
To see these additional updates run: apt list --upgradable
Last login: Tue Mar 2 21:51:47 2021 from XXX.XX.X.XXX
ubuntu@ip-XXX.XX.X.XXX:~$
After getting some feedback I also tried
ssh -i "~/Documents/AWS-Keys/mykey.pem" ubuntu@XX.XXX.XX.XXX
docker-compose down;
docker image prune -f;
exit
Same result.
My understanding is that you want to run the commands on the server; in that case, just write them after ssh:
ssh -i "~/Documents/AWS-Keys/mykey.pem" ubuntu@XX.XXX.XX.XXX "docker-compose down; docker image prune -f"
For a longer script you can send the commands via a heredoc:
ssh -i "~/Documents/AWS-Keys/mykey.pem" ubuntu@XX.XXX.XX.XXX <<COMMANDS
docker-compose down
docker image prune -f
COMMANDS
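Putting the pieces together, the whole upload script could look roughly like this (a sketch reusing the key path, placeholder host and commands from the question; $HOME is used instead of ~ so the path expands inside the variable):
#!/bin/bash
KEY="$HOME/Documents/AWS-Keys/mykey.pem"
HOST="ubuntu@XX.XXX.XX.XXX"

# log in and shut down docker-compose, then remove all images
ssh -i "$KEY" "$HOST" "docker-compose down; docker image prune -f"

# copy local files to the server
scp -r -i "$KEY" ./ubuntu "$HOST":/home

# log in and start docker-compose
ssh -i "$KEY" "$HOST" "docker-compose up -d"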

ssh tunnel script hangs forever on beanstalk deployment

I'm attempting to create an SSH tunnel when deploying an application to AWS Elastic Beanstalk. I want the tunnel to run as a background process that is always connected once the application is deployed. The script is hanging forever during the deployment and I can't see why.
"/home/ec2-user/eclair-ssh-tunnel.sh":
mode: "000500" # u+rx
owner: root
group: root
content: |
cd /root
eval $(ssh-agent -s)
DISPLAY=":0.0" SSH_ASKPASS="./askpass_script" ssh-add eclair-test-key </dev/null
# we want this command to keep running in the backgriund
# so we add & at then end
nohup ssh -L 48682:localhost:8080 ubuntu#[host...] -N &
and here is the output I'm getting from /var/log/eb-activity.log:
[2019-06-14T14:53:23.268Z] INFO [15615] - [Application update suredbits-api-root-0.37.0-testnet-ssh-tunnel-fix-port-9#30/AppDeployStage1/AppDeployPostHook/01_eclair-ssh-tunnel.sh] : Starting activity...
The ssh tunnel is spawned, and I can find it by doing:
[ec2-user@ip-172-31-25-154 ~]$ ps aux | grep 48682
root 16047 0.0 0.0 175560 6704 ? S 14:53 0:00 ssh -L 48682:localhost:8080 ubuntu#ec2-34-221-186-19.us-west-2.compute.amazonaws.com -N
If I kill that process, the deployment continues as expected, which indicates that the bug is in the tunnel script. I can't seem to find out where, though.
You need to add the -n option to ssh when running it in the background, so that it does not read from stdin.
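With that flag added (and keeping the rest of the command from the question), the tunnel line would look roughly like this:
nohup ssh -n -L 48682:localhost:8080 ubuntu@[host...] -N &   # -n: don't read from stdin, so the deploy hook can finish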

How do I expand volume size on docker image

The default size of the /dev/mapper/docker-XXX filesystem is 10 GB. I followed other instructions to edit /etc/sysconfig/docker-storage and add --storage-opt dm.basesize=50G. Next I do:
sudo service docker restart
sudo service ecs restart
I can see
# ps -ef | grep docker | grep stor
root 5966 1 0 21:45 pts/0 00:00:01 /usr/bin/dockerd --default-ulimit nofile=1024:4096 --storage-driver devicemapper --storage-opt dm.basesize=50G --storage-opt dm.thinpooldev=/dev/mapper/docker-docker--pool --storage-opt dm.use_deferred_removal=true --storage-opt dm.use_deferred_deletion=true --storage-opt dm.fs=ext4
So it looks like it took effect; however, when I look inside the running docker container it is still 10 GB:
# docker exec -it 601f6a9e9418 bash
root@601f6a9e9418:/# df
Filesystem 1K-blocks Used Available Use% Mounted on
/dev/mapper/docker-202:1-263443-880571d796b21f307753d4f4ecca2141b85119985fac00001ea2622ce643b45f 10190136 7295128 2354336 76% /
Any help is greatly appreciated.
Try this (link: How to increase Docker container default size?):
(optional) If you have already downloaded any images via docker pull, you need to clean them up first - otherwise they won't be resized:
docker rmi your_image_name
Edit the storage config
vi /etc/sysconfig/docker-storage
There should be something like DOCKER_STORAGE_OPTIONS="...", change it to DOCKER_STORAGE_OPTIONS="... --storage-opt dm.basesize=100G"
Restart the docker daemon
service docker restart
Pull the image
docker pull your_image_name
(optional) verification
docker run -i -t your_image_name /bin/bash
df -h
I was struggling with this a lot until I found this link: http://www.projectatomic.io/blog/2016/03/daemon_option_basedevicesize/ - it turns out you have to remove and re-pull the image after enlarging the basesize.
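In other words, the sequence that finally works looks roughly like this (a sketch using the placeholder image name from the steps above):
docker rmi your_image_name                     # remove the image created with the old basesize
docker pull your_image_name                    # pull it again so it is created with the new basesize
docker run --rm -it your_image_name df -h /    # the root filesystem should now show the larger size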

How to stop gunicorn properly

I'm starting gunicorn with the Django command python manage.py run_gunicorn. How can I stop gunicorn properly?
Note: I have a semi-automated server deployment with fabric. Thus using something like ps aux | grep gunicorn to kill the process manually by pid is not an option.
To see the processes, run ps ax | grep gunicorn; to stop gunicorn_django, run pkill gunicorn.
One option would be to use Supervisor to manage Gunicorn.
Then again, I don't see why you can't kill the process via Fabric.
Assuming you let Gunicorn write a pid file, you could easily read that file in a Fabric command.
Something like this should work:
run("kill `cat /path/to/your/file/gunicorn.pid`")
pkill gunicorn
or
pkill -P1 gunicorn
should kill all running gunicorn processes
pkill gunicorn stops all gunicorn daemons. So if you are running multiple instances of gunicorn with different ports, try this shell script.
#!/bin/bash
Port=5000
pid=`ps ax | grep gunicorn | grep $Port | awk '{split($0,a," "); print a[1]}' | head -n 1`
if [ -z "$pid" ]; then
  echo "no gunicorn daemon on port $Port"
else
  kill $pid
  echo "killed gunicorn daemon on port $Port"
fi
ps ax | grep gunicorn | grep $Port shows the daemons on a specific port.
Here is the command which worked for me :
pkill -f gunicorn
It will kill any process whose command line contains gunicorn.
Start:
gunicorn --pid PID_FILE APP:app
Stop:
kill $(cat PID_FILE)
The --pid flag of gunicorn requires a single parameter: a file where the process id will be stored. This file is also automatically deleted when the service is stopped.
I have used PID_FILE for simplicity but you should use something like /tmp/MY_APP_PID as file name.
If the PID file exists it means the service is running. If it is not there, the service is not running. To stop the service just kill it as mentioned.
You may also want to include the --daemon flag in order to detach the process from the current shell.
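For example (a sketch; /tmp/my_app.pid and the myproject.wsgi module are placeholders for your own paths):
gunicorn --daemon --pid /tmp/my_app.pid myproject.wsgi:application   # start detached, writing the pid file
kill $(cat /tmp/my_app.pid)                                          # stop it later via the pid file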
To start a service that runs gunicorn (assuming a systemd unit named myproject):
sudo systemctl enable myproject
sudo systemctl start myproject
or
sudo systemctl restart myproject
But to stop the service running gunicorn:
sudo systemctl stop myproject
To know more about hosting a Python application with gunicorn, please refer to the linked guide.
kill -9 `ps -eo pid,command | grep 'gunicorn.*${moduleName:appName}' | grep -v grep | sort | head -1 | awk '{print $1}'`
ps -eo pid,command fetches only the process id and the command with its arguments
grep -v grep gets rid of output like 'grep --color=auto xxx'
sort | head -1 sorts ascending and takes the first line
awk '{print $1}' extracts the pid
One more thing you may need to pay attention to: where gunicorn is installed and which one you're using. Ubuntu 16 has gunicorn installed by default; the executable is gunicorn3, located at /usr/bin/gunicorn3. If you installed it with pip, it's located at /usr/local/bin/gunicorn. Use which gunicorn and gunicorn -v to find out.
In your terminal, do:
ps ax|grep gunicorn
Then, to kill the Gunicorn process, just run:
kill -9 <gunicorn pid number>
In my case I dealt with many processes
For example: kill -9 398 399 4225 4772
The above solutions do not remove the pid file when the process is killed.
cat <pid-file> | xargs kill -2
This solution reads the pid file and sends an interrupt signal (SIGINT). This closes gunicorn properly and the pid file is removed as well.
A PID file can be generated with
gunicorn --pid PID-FILE
or by adding the following to the config file:
pidfile = "pid_file"
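Combined, the whole start/stop cycle with this approach could look roughly like this (a sketch; the pid file path and the app:app module are only examples):
gunicorn --pid /tmp/gunicorn.pid app:app &    # start gunicorn, writing its pid to the file
cat /tmp/gunicorn.pid | xargs kill -2         # send SIGINT: gunicorn shuts down and removes the pid file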
Running pkill gunicorn stops all gunicorn services. If we only want to restart gunicorn on one port, we just need to stop the parent process associated with the service listening on the port where gunicorn runs.
The following script searches for that process (pid) and, if it exists, kills it:
#!/bin/bash
# ---------------------
stop_unicorn_on_port() {
  pid=$(lsof -w -t -i "TCP:${1}" | head -1)
  if [ -z "${pid}" ]; then
    echo "🦄 no service daemon on port ${1}"
  else
    kill -9 "${pid}"
    echo "🦄 killed service daemon(${pid}) on port ${1}"
  fi
}
# Example/Testing
stop_unicorn_on_port 5000
stop_unicorn_on_port 5001
stop_unicorn_on_port 5002
For more info, check man lsof:
-t specifies that lsof should produce terse output with process identifiers only and no header - e.g., so that the output may be piped to kill(1). -t selects the -w option.
-i selects the listing of files any of whose Internet address matches the address specified in i. If no address is specified, this option selects the listing of all Internet and x.25 (HP-UX) network files...
Here are some sample addresses:
-i6 - IPv6 only
TCP:25 - TCP and port 25
@1.2.3.4 - Internet IPv4 host address 1.2.3.4
I built upon @David's recommendation to use --pid (PID_FILE) to fix the problem I faced, because killing the parent pid didn't kill the worker processes.
import os
import sys
import psutil

def stop_pid(pid):
    if sys.platform == 'win32':
        p = psutil.Process(pid)
        p.terminate()  # or p.kill()
    else:
        os.system('kill -9 {0}'.format(pid))

def get_child_pids(ppid):
    pid_list = []
    for process in psutil.process_iter():
        _ppid = process.ppid()
        if _ppid == ppid:
            _pid = process.pid
            pid_list.append(_pid)
    return pid_list

def send_kill_cmd(ppid, cpids):
    stop_pid(ppid)  # Killing the parent proc first
    for pid in cpids:
        stop_pid(pid)

if __name__ == '__main__':
    parent_pid = int(sys.argv[1])
    child_pids = get_child_pids(parent_pid)
    send_kill_cmd(parent_pid, child_pids)
Then finally execute the above Python script with the commands below:
#!/bin/bash
FILE_NAME=PID_FILE
if [ -f "$FILE_NAME" ]; then
  pypy stop_gunicorn.py "$(cat PID_FILE)"
  echo "killed - $(cat PID_FILE) and its child processes."
  sleep 2
fi
echo 'Starting gunicorn'
nohup gunicorn --workers 1 --bind 0.0.0.0:5050 app:app --thread 50 --worker-class eventlet --reload --pid PID_FILE > nohup_outs/nohup_process.out &