AWS snapshot did not recover images

I have an EC2 instance that I use as a WordPress server, and over the weekend the database got corrupted. So today I had no choice but to recover it from a snapshot taken before the weekend. The recovery worked, and everything seems to be working fine except for one big thing: the images are not showing up. If I go to the website, the images are gone, and when I log in to the admin, the images are there, but grey.
Link to a screenshot: https://imgur.com/a/TPV1HXZ
I followed the official guide for restoring, and everything else is working. Any ideas as to what I can do?
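One thing worth checking first (a sketch, not a definitive fix, assuming the common Ubuntu/Apache layout with WordPress under /var/www/html and the www-data user; adjust both to your setup) is whether the uploads directory actually came back with the snapshot and whether the web server can still read it:

    # Confirm the uploads directory restored with the snapshot and still has content
    ls -ld /var/www/html/wp-content/uploads
    du -sh /var/www/html/wp-content/uploads

    # If the files are present but grey in the admin, ownership/permissions are a common culprit
    sudo chown -R www-data:www-data /var/www/html/wp-content/uploads
    sudo find /var/www/html/wp-content/uploads -type d -exec chmod 755 {} \;
    sudo find /var/www/html/wp-content/uploads -type f -exec chmod 644 {} \;

    # If the files are missing entirely, check whether wp-content lived on a separate
    # volume that was not part of the snapshot you restored
    lsblk
    df -h /var/www/html/wp-content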

Related

SageMaker Studio does not load up

SageMaker Studio worked flawlessly for me for the first 6 months. Then I started observing this issue. [Screenshot of the error message]
The screen hangs at this stage forever. Here's what I have tried:
Clearing my cache, and even using a different machine, so I don't think the issue lies with the browser or my machine.
Pressing 'Clear workspace' in the screenshot above.
Shutting down all the apps in my SageMaker domain (excluding the 'default' app). This used to work initially, but now it has stopped working altogether.
Creating a new SageMaker domain with a fraction of the files of the previous domain. I still see the same error message in the new domain.
This is severely affecting my work and I can't find a solution for this anywhere on the internet.
I've seen this issue before. Restarting the JupyterServer app and clearing out the EFS storage attached to your Studio domain helps.
It is also worth checking the JupyterServer logs to see what is causing the issue, and make sure you are using the latest version of Studio.
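If you want to do that restart from the command line rather than the Studio UI, a minimal sketch with the AWS CLI (the domain ID and user profile name below are placeholders; the JupyterServer app is normally named 'default'):

    # Find the JupyterServer app for your user profile
    aws sagemaker list-apps --domain-id-equals d-xxxxxxxxxxxx

    # Delete it; Studio recreates the JupyterServer app the next time you open Studio
    aws sagemaker delete-app \
        --domain-id d-xxxxxxxxxxxx \
        --user-profile-name my-user-profile \
        --app-type JupyterServer \
        --app-name default

For the JupyterServer logs mentioned above, look in CloudWatch under the /aws/sagemaker/studio log group.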

Elasticsearch indices keep getting lost

I have an Elasticsearch cluster on an AWS EC2 instance. It is a t3.small with 2 vCPUs and 2 GB of RAM. I have installed Elasticsearch and Kibana, and on top of those, Heartbeat and Metricbeat. The database I'm working with is MongoDB, and all my data is NoSQL. I feed my engine from my MongoDB cluster, which runs on my local machine, using a script. I feed the engine and run the queries from my app and also from the console. So far so good, everything is fine, except that the cluster is always yellow, never green.
The problem starts after hitting the engine with multiple requests. After 50 or 60 search queries the data just disappears: somehow my engine is forcefully dropping my indices, it is not able to restore that data (obviously, I have no snapshot and no restore point), and I keep losing it, so I have to feed the engine manually again and again. At first I had 1 GB of RAM, so I thought upgrading would fix the issue, but after upgrading to 2 GB it didn't stop; the data just stays there a bit longer now.
So here are my DB configs:
I have 70K+ NoSQL documents,
which contain text and geo_point types.
I make POST requests to the engine through my front-end application.
I don't have Logstash installed, but Metricbeat is not showing any error logs.
My whole Elasticsearch setup is for testing purposes; this is not production.
We will upgrade when we go to production.
So I need to know:
what the reason behind this is, and
how to prevent this huge data loss.
So please help me, or just suggest how to solve this problem.
Thank you
Ideally, the first thing you should do is make the cluster green.
To see the exact Elasticsearch error that is causing this situation, look at the elasticsearch.log file. It will contain the exact error.
One way to keep cluster data safe is to take regular snapshots and restore in case of data loss. Details of the snapshot procedure can be found here.
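For context, on a single-node cluster a yellow status usually just means the replica shards have nowhere to go; dropping replicas to 0 turns it green. Below is a rough sketch of that plus the snapshot setup, assuming Elasticsearch on localhost:9200 and a repository path such as /mnt/es-backups that is whitelisted in path.repo in elasticsearch.yml (both are placeholders):

    # Check cluster health and see which shards are unassigned
    curl -X GET "localhost:9200/_cluster/health?pretty"
    curl -X GET "localhost:9200/_cat/shards?v"

    # A single node has nowhere to allocate replicas, so set them to 0 to go green
    curl -X PUT "localhost:9200/_all/_settings" -H 'Content-Type: application/json' -d '
    { "index": { "number_of_replicas": 0 } }'

    # Register a filesystem snapshot repository
    curl -X PUT "localhost:9200/_snapshot/my_backup" -H 'Content-Type: application/json' -d '
    { "type": "fs", "settings": { "location": "/mnt/es-backups" } }'

    # Take a snapshot of all indices; restore later with POST .../snapshot_1/_restore
    curl -X PUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true"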

Google Cloud VM Files Deleted after session disconnect

Some of my GCP instances are behaving in a way similar to what is described in the link below:
Google Cloud VM Files Deleted after Restart
At times the session gets disconnected after a short period of inactivity. On reconnecting, the machine is as if it were freshly installed (not on restarts, as in the above link). All the files are gone.
As you can see in the attachment, it creates the profile directory fresh when the session is reconnected. Also, none of the installations I have made are there; everything is lost, including the root installations. Fortunately, I have been logging all my commands and file setups manually on my client, so nothing is lost, but I would like to know what is happening and resolve this for good.
This has now happened a few times.
A point to note is that if I get a clean exit, for example if I properly log out or exit the SSH session, I get the machine back as I left it when I reconnect. The issue is only there when the session disconnects itself. There have also been instances where the session disconnected and I was still able to connect back.
The issue is not there on all my VMs.
From the suggestions from the link I have posted above:
I am not connected to Cloud Shell; I am SSHing into the machine using the Chrome extension.
I have not manually mounted any disks (as far as I know).
I have checked the logs from gcloud compute instances get-serial-port-output --zone us-east4-c INSTANCE_NAME, but I could not make much of them. Is there anything I should look for specifically?
Any help is appreciated.
Please find the links to the logs, as suggested by #W_B.
The one below is from the 8th, when the machine was restarted and the files were deleted:
https://pastebin.com/NN5dvQMK
It happened again today. I didn't run the command immediately that time; the file below is from afterwards, though:
https://pastebin.com/m5cgdLF6
The one below is from after logging out today:
https://pastebin.com/143NPatF
Please note that I have replaced the user ID, the system name and a lot of numeric values in general using regexps, so there is a slight chance that the time and other values have changed. Not sure if that would be a problem.
I have added the screenshot of the current config from the UI
Using a locally attached SSD seems to be the cause; it is explained here:
https://cloud.google.com/compute/docs/disks/local-ssd#data_persistence
You need to use a persistent disk; otherwise it will behave just as you describe.
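To confirm from the CLI whether the affected VMs really are running on local SSDs, and to add a persistent disk that survives these events, a sketch along these lines should work (the disk name, size and device path are placeholders; the zone is taken from the command above):

    # Local SSDs show up with type SCRATCH; persistent disks with type PERSISTENT
    gcloud compute instances describe INSTANCE_NAME \
        --zone us-east4-c --format="yaml(disks)"

    # Create and attach a persistent disk
    gcloud compute disks create my-data-disk --size=200GB --zone=us-east4-c
    gcloud compute instances attach-disk INSTANCE_NAME \
        --disk=my-data-disk --zone=us-east4-c

    # On the VM: format it once (check the device name with lsblk first), then mount
    sudo mkfs.ext4 -m 0 /dev/sdb
    sudo mkdir -p /mnt/disks/data
    sudo mount -o discard,defaults /dev/sdb /mnt/disks/data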

IPython notebook remote server on AWS

I'm running a remote IPython notebook server on an EC2 instance on AWS. The instance is running Ubuntu.
I followed this tutorial to set up, and everything seems to work: I can access the notebook via HTTPS with a password and run code.
However, I can't seem to save changes to the notebook. It says "saving notebook" and then nothing happens (i.e., it still says 'unsaved changes' at the top).
Any ideas would be greatly appreciated.
Edit: It's not a permissions problem, since running with sudo doesn't help.
When creating a new notebook on the remote server, I am able to save. The problem only occurs for notebooks pulled from my git repository. Also, when opening a problematic notebook and deleting all cells until it's absolutely empty, I can sometimes (!) save the empty notebook, and sometimes (!!) I still can't.
I've encountered an issue where notebooks wouldn't save on an nbserver on an AWS EC2 instance that I set up in a similar manner via a different tutorial. It turned out I had to refresh and log in again using the password, because my browser would automatically log me out after a certain period. It might help to close the nbserver page, go back to it, and see if it asks you to log in again.
Here are a few other things you can try:
try copying a problematic notebook onto the server directly (scp) and opening and saving it there, as opposed to going through a repo pull, to see if anything changes (see the sketch after this list)
check whether the hanging "saving notebook" message appears for notebooks in certain directories
check the IPython console messages when you save a problematic notebook and see if anything there helps you pinpoint the issue
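For the first check, something like this (the key file, host name and paths are placeholders; the default user on an Ubuntu AMI is ubuntu):

    # Copy one problematic notebook straight onto the server, bypassing the git pull
    scp -i my-key.pem notebook.ipynb ubuntu@ec2-xx-xx-xx-xx.compute-1.amazonaws.com:~/notebooks/

    # On the server: confirm the notebook is owned and writable by the user running the nbserver
    ls -l ~/notebooks/notebook.ipynb
    chmod u+w ~/notebooks/notebook.ipynb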

Django pages loading very slowly in EC2

I'm at a loss here.
I am attempting to transfer a Django application to EC2. I have moved the DB to RDS (Postgres) and have static and media files on S3.
However, for some reason all my pages are taking 25-30 seconds to load. I have checked the instances, and CPU and memory barely blip. I checked and turned off KeepAlive in Apache and changed WSGI to run in daemon mode, but none of this made any difference. I have gone into the shell on the machine and accessed the DB, and it appears to respond fine as well. I have also increased the EC2 instance size, with no effect.
S3 items are also being delivered quickly and without issue. Only the rendering of the HTML is taking a long time.
On our current live and test servers there are no issues; the pages load in milliseconds.
Can anyone point me to where or what I should be looking at?
Marc
The issue appeared to be connected with using RDS. I installed Postgres on the EC2 instance itself and, apart from a little mucking around, it worked fine there.
I'm going to try building a new RDS instance, but that was the issue here. Strange that it worked OK directly via manage.py shell.
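One quick way to see whether the time really goes into the round trips to RDS (the endpoint, user and database below are placeholders) is to time a trivial query from the EC2 box itself and compare it with the local Postgres install:

    # From the EC2 instance, time a trivial query against RDS...
    time psql -h mydb.xxxxxxxx.us-east-1.rds.amazonaws.com -U myuser -d mydb -c "SELECT 1;"

    # ...and against the Postgres installed locally on the instance
    time psql -h localhost -U myuser -d mydb -c "SELECT 1;"

If a single query is fast but pages still take 25-30 seconds, the time is more likely being spent opening a new connection per request or resolving the RDS endpoint rather than in the queries themselves.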