We have two EC2 servers, both running Amazon Linux 2. One has RabbitMQ on it; the second is a new one for storage purposes.
On the second one we just attached a new volume, mounted at /data (df shows: /dev/nvme1n1 70G 104M 70G 1% /data),
where we would love to push our RabbitMQ queues and data. Basically, we would like to set RABBITMQ_MNESIA_DIR on the first (RabbitMQ) server so that it connects directly to the second instance and saves queues in the remote /data mentioned above.
Currently that is /var/lib/rabbitmq/mnesia, and our RabbitMQ config file is just the default /etc/rabbitmq/rabbitmq.conf.
I wonder if somebody has done this before, or can point us in the right direction on how to set RABBITMQ_MNESIA_DIR to connect directly to the remote EC2 instance and store and work with queues from there. Thank you.
At the end of the day @Parsifal was right.
We ended up making one instance bigger and changed RABBITMQ_MNESIA_DIR (a sketch of the env-file change is below).
This was a bit tricky. After restarting with service rabbitmq-server restart,
the first thing needed was to make sure we had the right permissions on the /data/mnesia directory we mounted. I managed it with chmod -R 755 /data, though read/write should be sufficient based on the docs.
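For reference, here is roughly what that change looks like; a minimal sketch assuming the stock /etc/rabbitmq/rabbitmq-env.conf (the RABBITMQ_ prefix is optional inside this file, and RabbitMQ appends the node name, e.g. rabbit@hostname, under the base directory):

# /etc/rabbitmq/rabbitmq-env.conf
# put the node's data (mnesia) directory on the new volume
MNESIA_BASE=/data/mnesia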
Then we had to track down why it always produced errors like "broker forced connection closure with reason 'shutdown'" and "Error on AMQP connection" right after startup.
So I checked the ownership of the current mnesia dir against the new one, and it turned out the new one was owned by root:root, unlike the original.
Switching it to match the original (drwxr-xr-x 4 rabbitmq rabbitmq 97 Dec 16 14:57 mnesia) got things working; the one-liner is below.
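Fixing the ownership is a one-liner, assuming the rabbitmq user and group that the package install created:

sudo chown -R rabbitmq:rabbitmq /data/mnesia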
Maybe this will save you some headaches: I didn't realize there was a separate user and group for rabbitmq, since I didn't create them myself (the package install did).
One thing to add: once you are moving an existing, working mnesia directory, consider copying the old directory into the new one first, since there is a lot of state in there that was actively being used. I tried it without copying, and even the admin password didn't work :D A sketch of the copy is below.
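A sketch of that copy, using the paths from this setup, with the broker stopped while its data moves:

sudo service rabbitmq-server stop
# -a preserves ownership and permissions along with the data
sudo rsync -a /var/lib/rabbitmq/mnesia/ /data/mnesia/
sudo chown -R rabbitmq:rabbitmq /data/mnesia
sudo service rabbitmq-server start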
The title explains the question itself. My problem is that every time I connect to my VM through SSH, the session times out after a period of time. I'd like to let my Python script work on its own for hours or days. Any advice? Thanks.
The VM instance will keep running even if your SSH session times out.
You can keep the SSH session alive by adding the following lines to your $HOME/.ssh/config file:

Host remotehost
    HostName remotehost.com
    ServerAliveInterval 240

With ServerAliveInterval set, the client sends a keepalive probe every 240 seconds, so idle sessions are not silently dropped.
There's a similar option in PuTTY.
To keep the process alive after disconnecting, you have multiple options, including those already suggested in the comments:
nohup
screen
setsid
cron
service/daemon
Which one to choose depends on the specifics of the task the script performs; for the common case, a minimal nohup sketch follows below.
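For a long-running Python script that just needs to survive the disconnect, nohup is usually the simplest; the script name here is a placeholder:

nohup python3 my_script.py > my_script.log 2>&1 &
# nohup shields the process from the hangup signal sent on disconnect,
# the redirection captures stdout/stderr, and the trailing & backgrounds it;
# check on it later with: tail -f my_script.log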
Happy ugly Christmas sweater day :-)
I am running into some strange problems with my AWS Ubuntu 16.04 instance running Hadoop 2.9.2.
I have just successfully installed and configured Hadoop to run in pseudo-distributed mode. Everything seems to be fine: when I start HDFS and YARN I don't get any errors. But as soon as I try to do even something as simple as list the contents of the root HDFS directory, or create a new directory, the whole instance becomes extremely slow. I wait for about 10 minutes and it never produces a directory listing, so I hit Ctrl+C, and it takes another 5 minutes to kill the process. Then I try to stop both HDFS and YARN; that succeeds but also takes a long time. Even after HDFS and YARN have been stopped, the instance is still barely responsive. At that point all I can do to make it function normally again is to go to the AWS console and restart it.
Does anyone have any idea what I might've screwed up? (I am pretty sure it's something I did. It usually is :-) )
Thank you.
Well, I think I figured out what was wrong, and the answer is trivial: my EC2 instance doesn't have enough RAM. It's a basic free-tier-eligible instance, and by default it comes with only 1 GB of RAM. Hilarious. Totally useless. (A quick check is sketched below.)
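If you want to confirm the memory ceiling from inside the instance, free ships with procps on stock Ubuntu and Amazon Linux images:

free -m
# the 'total' column of the Mem row reads roughly 990 on a 1 GB instance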
But I learned something useful anyway. One other thing I had to do to make my Hadoop installation work (I was getting a "connection refused" error, but I did make it work) was to change the line in core-site.xml that says
<value>hdfs://localhost:9000</value>
to
<value>hdfs://ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com:9000</value>
(replace the XXXs in the above with your instance's IP address)
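To confirm the NameNode actually bound to the new address after the change (port 9000 matches the value above; the XXXs are your instance's address as in the question):

# is anything listening on the NameNode port?
sudo netstat -tlnp | grep 9000
# can the instance resolve its own public DNS name?
nslookup ec2-XXX-XXX-XXX-XXX.compute-1.amazonaws.com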
I am using a ColdFusion MX8 server, and one of the scheduled tasks had been running for 2 years, but suddenly, as of 01/12/2014, the scheduled tasks stopped running. When I browse the file in a browser, the file runs successfully without error.
I am not sure whether there is an update or license-expiration problem. I am aware that Adobe ended support for ColdFusion 8 in the middle of this year.
The most common cause of a problem like this is external to the server. When you say you browsed to the file and it worked in a browser, it is very important to know whether that test was performed on the server desktop itself. Knowing that you can browse to the file from your own desktop or laptop is of little value.
The most common source of issues like this is a change in the DNS or network stack that is interfering with resolution. For example, if the internal DNS serving your DMZ suddenly starts serving the "external" address, your server can suddenly no longer browse to your own domain. Or the IP the server resolves for the domain in question changes from 127.0.0.1 to some other IP that the server can't access correctly because of a reverse proxy, load balancer, or some other rule. Finally, sometimes Apache or IIS is altered so that an IP that was previously serviced (127.0.0.1 being the most common example) no longer responds. A quick sanity check from the server itself is sketched below.
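A sketch of that check, run from the server's own shell; the domain and task URL here are placeholders for your own:

nslookup yourdomain.example.com
# does the answer match the address the scheduler should be reaching?
curl -s -o /dev/null -w "%{http_code}\n" http://yourdomain.example.com/tasks/mytask.cfm
# anything other than 200 suggests the scheduler can't fetch the page either
# (if curl isn't installed on the server, a browser test run on the
# server desktop itself answers the same question)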
If it is something intrinsic to the scheduler service, then Frank's advice is pretty good; especially look for "proxy scheduler" entries in the log, as they can give you good clues. I would also log the results of a scheduled task to a file, then check the file. If it exists, then your scheduled tasks ARE running; they are just not succeeding. Good luck!
I've seen the cf scheduling service crash in CF8. The rest of CF is unaffected.
Have you tried restarting the server?
Here are your concerns:
Your File (works since you tested it manually).
Your Scheduled Task (failed).
Your ColdFusion application/service (any changes here?).
Your server (any changes at that level?).
To test your problem create a duplicate task and schedule it. Leave the other one in place (maybe set your new one to run earlier). Use the same file too. See if it completes.
If it doesn't, then you have a larger problem. Since the ColdFusion server sits atop the JVM, there could be something happening there. Things just don't stop working unless something got corrupted or you got compromised. If you hardened your server by rearranging/renaming the file structure to make it more secure, that would break your task.
So going back: if your test schedule works, then determine what is different between the two. Note that you have logging capabilities; see "Logging abilities for CF8".
If you are not directly in charge of maintaining this server, then I would recommend asking around to see if there was recent maintenance; if so, what was done to the server?
My ping to the AWS instance is around 50 ms and cat'ing files through ssh takes well under a second, but when I mount a directory using sshfs and open it in Sublime Text 3 or Gedit, the lags are greater than 10 seconds.
1. Is there anything I could do to reduce those lags?
2. Why does it work like that?
3. Are there better tools for remote file editing?
My ssh config:
Host myinstance
HostName ********
User ec2-user
IdentityFile ~/idfile
Compression no
Ciphers arcfour
ServerAliveInterval 15
As a first step, I'd suggest adding this line to your settings (Preferences -> Settings-User):
"atomic_save": false
and see if that does the trick. My answer to this question has some more details behind why this works, but basically what Sublime is doing with atomic_save enabled is creating new temp files and deleting the original file, then renaming the temp back to the original's name. This leads to a very significant increase in traffic over the connection, and if the server on the other side of the pipe is a little bit laggy anyway, it can really slow down Sublime.
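Beyond the editor-side fix, sshfs's own caching options can take the edge off per-file round trips. A sketch using the Host alias from the config above; the remote path and mount point are placeholders, and the option names follow classic sshfs (newer releases spell some of the cache options differently):

sshfs myinstance:/home/ec2-user ~/remote-home \
    -o reconnect,auto_cache,kernel_cache,cache_timeout=60
# auto_cache/kernel_cache let the kernel reuse cached pages between opens
# cache_timeout keeps sshfs's attribute/directory cache for 60 seconds
# reconnect transparently re-establishes dropped connections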
I run all my Django sites as SCGI daemons. I won't get into the fundamentals of why I do this but this means that when a site is running, there is a set of processes running from the following command:
/websites/website-name/manage.py runfcgi method=threaded host=127.0.0.1 port=3036 protocol=scgi
All is fine until I want to roll out a new release from the VCS (Bazaar in my case). I made an alias script called up that does the following:
alias up='bzr up; killall manage.py'
It is this generic for one simple reason: I'm lazy. I want one command that I can use under any site to update it. I'm logged into the server most of the time anyway, so I just hop into the root of the right site and call up. The site updates from BZR and restarts.
The first downside of this is that it kills all the manage.py processes on the machine, currently 6 sites and growing rapidly. The second (and potentially worse, at least for end-users) is that it's a severely non-graceful restart: if somebody was uploading an image or doing something else with a long connection time, their request would just die on the vine.
So what I'm looking for is suggestions for a single method that:
Is generic for lazy people like me (e.g. I can run it from any site root without having to remember which command I need to call); 'up' is the perfect name.
Only kills the current site. I'm only updating the current site, so only this one should die.
Does the restart in a graceful manner. If possible, it should wait until there are no more active connections. I've no idea how feasible this is.
Instead of killing everything with manage.py in the name, could you write a script for each site that only kills the manage.py processes belonging to that site? (Edit: just write the scripts, put them in the root of each site (which you cd into anyway), and run those; still only one command to remember.) A sketch of such a script follows below.
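A minimal sketch of that per-site script, assuming it lives in the site root you already cd into and that the port matches the runfcgi command from the question; it covers the 'generic' and 'only this site' points, but not graceful draining, which would need support from the SCGI front end:

#!/bin/sh
# up: update this site from bzr and restart only its own manage.py
bzr up
# match only processes whose command line contains this site's manage.py
pkill -f "$(pwd)/manage.py"
"$(pwd)/manage.py" runfcgi method=threaded host=127.0.0.1 port=3036 protocol=scgi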
I don't know enough about SCGI or Bazaar to suggest much more than that... My method (I'm lazy too) uses Mercurial and Fabric for deployment: http://stevelosh.com/blog/entry/2009/1/15/deploying-site-fabric-and-mercurial/ – maybe it will give you an idea you can use?