How to change the storage path from rrdtool on the Ganglia? - rrdtool

How to change the storage path from rrdtool on the Ganglia?
For example, my gmetad.conf has the default configuration shown below, but I want to store the data somewhere else. How do I change this path?
# Where gmetad stores its round-robin databases
# default: "/var/lib/ganglia/rrds"
rrd_rootdir "/some/other/place"
I tried to change rrd_rootdir, but it doesn't work.
Thanks
Namir Rachid

Well, you missed a few things. I will go through them in detail, but before that, stop the gmetad daemon:
Step 1: Create the directory where you want to store Ganglia's rrdtool data.
[root@ganglia-server ganglia-3.6.0]# mkdir -p /some/other/place/
Step 2: Make ganglia the owner of this directory.
[root@ganglia-server ganglia-3.6.0]# chown -R ganglia /some/other/place/
Step 3: Set the permissions as well. Mode 777 is the quickest way to test; since ganglia now owns the directory, a tighter mode such as 755 should normally be enough.
[root@ganglia-server ganglia-3.6.0]# chmod -R 777 /some/other/place/
Step 4: Set rrd_rootdir to /some/other/place in gmetad.conf. Don't forget to remove the pound symbol (#) in front of the line.
# Where gmetad stores its round-robin databases
# default: "/var/lib/ganglia/rrds"
rrd_rootdir "/some/other/place"
# rrd_rootdir "/some/other/place"
Step 5: Run gmetad in the foreground with debug output and check in the log that data is being written to /some/other/place.
[root@ganglia-server ganglia-3.6.0]# gmetad/gmetad -d 5 -c /etc/ganglia/gmetad.conf
Going to run as user ganglia
Sources are ...
Source: [my cluster, step 15] has 1 sources
127.0.0.1
xml listening on port 8651
interactive xml listening on port 8652
.......
.......
Updating host ganglia-server, metric cpu_steal
Created rrd /some/other/place/default/ganglia-server/cpu_steal.rrd
Updated rrd /some/other/place/default/ganglia-server/cpu_steal.rrd with value 1414567960:0.0
Updating host ganglia-server, metric load_one
Created rrd /some/other/place/default/ganglia-server/load_one.rrd
Updated rrd /some/other/place/default/ganglia-server/load_one.rrd with value 1414567960:0.01
Note: The gmetad executable may be at a different location on your machine; change the path as required to generate the log. In most cases, the gmetad daemon is installed at "/usr/local/sbin/gmetad".
Step 6: Start the gmetad daemon now.
It worked for me. And, hopefully, it should work for you too.
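Before restarting the daemon, it can help to confirm which rrd_rootdir is actually active in the config file. The following is a minimal sketch, assuming the value is written in double quotes as in the snippet above; the function name get_rrd_rootdir is made up for illustration and is not part of Ganglia.

```shell
# Print the active (uncommented) rrd_rootdir value from a gmetad.conf-style file.
# Falls back to the default path named in the config comments when no active line exists.
get_rrd_rootdir() {
    conf_file=$1
    value=$(sed -n 's/^[[:space:]]*rrd_rootdir[[:space:]]*"\([^"]*\)".*/\1/p' "$conf_file" | tail -n 1)
    printf '%s\n' "${value:-/var/lib/ganglia/rrds}"
}

# Example (not run here; the ganglia user name is taken from the steps above):
#   dir=$(get_rrd_rootdir /etc/ganglia/gmetad.conf)
#   sudo -u ganglia test -w "$dir" && echo "ganglia can write to $dir"
```

If the function still prints the default path, the rrd_rootdir line is most likely still commented out.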

Related

Elastic BeanStalk EC2 instance's log uses up whole disk space

I have an Elastic Beanstalk environment where I run my application on 1 EC2 instance. I added a load balancer when I initially configured the environment, but since then I have set it to use only 1 instance.
The application, which runs within a container, apparently produces quite a lot of logs: after several days they use up the whole disk and the application crashes. The health check drops to Severe.
I see that terminating the instance manually helps: the environment removes the old instance and creates a new one that works (until it fills up the whole disk again).
What are my options? A script that regularly cleans up logs? Some log rotation? A trigger that reboots the instance when the disk is nearly full?
I do not write anything to a file myself; my application only logs to stdout and stderr, so writing to files is done by the EC2/EBS wrapper. (I deploy the application as a ZIP containing a JAR, a bash script and a Procfile, if that is relevant.)
By default EB will rotate some of the logs produced by the Docker containers, but not all of them. After contacting support on this issue I received the following helpful config file, to be placed in the source path .ebextensions/liblogrotate.config:
files:
  "/etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.containers.conf":
    mode: "00644"
    owner: "root"
    group: "root"
    content: |
      /var/lib/docker/containers/*/*.log {
        size 10M
        rotate 5
        missingok
        compress
        notifempty
        copytruncate
        dateext
        dateformat %s
        olddir /var/lib/docker/containers/rotated
      }
  "/etc/cron.hourly/cron.logrotate.elasticbeanstalk.containers.conf":
    mode: "00755"
    owner: "root"
    group: "root"
    content: |
      #!/bin/sh
      test -x /usr/sbin/logrotate || exit 0
      /usr/sbin/logrotate /etc/logrotate.elasticbeanstalk.hourly/logrotate.elasticbeanstalk.containers.conf
container_commands:
  create_rotated_dir:
    command: mkdir -p /var/lib/docker/containers/rotated
    test: test ! -d /var/lib/docker/containers/rotated
  99_cleanup:
    command: rm /etc/cron.hourly/*.bak /etc/logrotate.elasticbeanstalk.hourly/*.bak
    ignoreErrors: true
What this does is install an additional log rotation configuration and cron task for the /var/lib/docker/containers/*/*.log files which are the ones not automatically rotated on EB.
Eventually, however, the rotated logs themselves could fill up the disk if the host lives long enough. The rotate 5 option above caps how many rotated files are kept per log. You can also add shred to the list of logrotate options (alongside compress, notifempty, etc.), though note that shred only overwrites files before they are deleted; it does not by itself remove more of them.
(However, I'm not sure how the container logs that EB already rotates by default are cleaned up, so those may accumulate too and require modifying the default EB log rotation config. I'm not sure how to do that yet. In most cases the solution above should be sufficient, since hosts typically do not live that long, but the volume of logging and the lifetime of your containers may force you to go even further.)
Log rotation is the way forward. You can create a configuration file in /etc/logrotate.d/ where you state your options, in order to avoid having large log files.
You can read more about the configurations here https://linuxconfig.org/setting-up-logrotate-on-redhat-linux
A sample configuration file would look something like this:
/var/log/your-large-log.log {
    missingok
    notifempty
    compress
    size 20k
    daily
    create 0600 root root
}
You can also test the new configuration file from the CLI by running the following:
logrotate -d [your_config_file]
This checks whether the rotation would succeed, but only in debug mode, so the log file is not actually rotated.
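The question also asks about reacting when the disk is nearly full. As a rough sketch of that idea (not something Elastic Beanstalk provides; the function names and the threshold are assumptions for illustration), a cron job could check the usage of the filesystem that holds the logs:

```shell
# Print the used-space percentage (integer, without "%") of the filesystem holding a path.
# Relies on the POSIX output format of df -P, where the fifth column is the capacity.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed (exit status 0) when usage is at or above the given threshold.
disk_nearly_full() {
    [ "$(disk_used_pct "$1")" -ge "$2" ]
}

# Example (not run here): warn when the Docker log volume is 90% full or more.
#   disk_nearly_full /var/lib/docker 90 && echo "disk nearly full" >&2
```

A check like this only reports the problem; log rotation is still what keeps the disk from filling up.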

Cannot chmod file on Openshift online v3 : Operation not permitted

I am migrating a Django application from Openshift v2 to v3 (In case you don't know, RedHat is shutting down v2 on September 30th, see: https://blog.openshift.com/migrate-to-v3-v2-eol/)
So, I am following this blog post to help me: https://blog.openshift.com/migrating-django-applications-openshift-3/ . I am new to all the Docker / Kubernetes concepts the new version is built upon.
I was able to make some progress : I managed to get a successful build of my app. Yet it crashes at deployment time:
---> Running application from script (app.sh) ...
/usr/libexec/s2i/run: line 42: /opt/app-root/src/app.sh: Permission denied
Indeed, app.sh has lost its x permission. I log into the failing container as debug and see it:
> oc debug dc/<my app>
> (app-root)sh-4.2$ ls -l /opt/app-root/src/app.sh
-rw-rw-r--. 1 default root 127 Sep 6 21:20 /opt/app-root/src/app.sh
The blog post states "Ensure that the app.sh file is executable by running chmod +x app.sh.", which I did in my local repo. Anyway, I wanted to do it again directly in the pod, but it doesn't work:
(app-root)sh-4.2$ chmod +x /opt/app-root/src/app.sh
chmod: changing permissions of ‘/opt/app-root/src/app.sh’: Operation not permitted
So, how can I set the x permission to app.sh ? Thank you
Without looking into more details, any S2I builder image will gladly use your custom supplied run script to start the application in an alternative way.
Create .s2i/bin/ (mind the dot) in your source code directory, place the run script into it and rebuild the app in OpenShift - it will automatically use your custom run script upon deployment.
This is the preferred way of starting applications using custom commands in OpenShift.
Regarding your immediate problem, there is a very simple reason why you can not change the permissions of the script: you were trying to modify the permissions in the deployed pod, and not the builder pod. Deployed pods run using different UIDs, usually somewhere in the range of 100000000, and definitely do not match the file ownership as generated by the build. Hence permission denied.
The root cause of your problem (app.sh losing executable permissions) must be in the way the build process installs those files, and indeed looking at the /usr/libexec/s2i/assemble script in the base image does seem to reveal the culprit. The last two lines are:
# set permissions for any installed artifacts
fix-permissions /opt/app-root
If you wanted to change this part of the build instead of using a custom run script, I suggest you then create .s2i/bin/assemble in your project's source code and make it look sort of like this:
#!/bin/bash
echo "Running stock build:"
${STI_SCRIPTS_PATH}/assemble
echo "Fixing the mess:"
chmod 755 /opt/app-root/src/app.sh
This will fix whatever the stock build process does to file permissions, and will do it using the same UID as the rest of the build, so file ownership shouldn't be an issue.
I stumbled upon this issue myself and found a way to resolve it.
You have to make your file app.sh executable and push it to your repo as such.
If git does not track this modification, as happened to me, you have to run git update-index --chmod=+x app.sh for it to work.
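To confirm that Git has really recorded the executable bit before pushing, you can look at the staged file mode. A small sketch, where staged_mode is a helper name invented for this example:

```shell
# Print the file mode Git has staged for a path:
# 100755 means executable, 100644 means a regular non-executable file.
staged_mode() {
    git ls-files -s -- "$1" | awk '{ print $1 }'
}

# Example (not run here), from the root of the repository:
#   git update-index --chmod=+x app.sh
#   staged_mode app.sh
```

If the mode is still 100644 after chmod +x, the repository is probably ignoring file mode changes, which is the case where git update-index --chmod=+x is needed.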

Redis telling me "Failed opening .rdb for saving: Permission denied"

I'm running Redis server 2.8.17 on a Debian server 8.5. I'm using Redis as a session store for a Django 1.8.4 application.
I haven't changed the software configuration on my server for a couple of months and everything was working just fine until a week ago when Django began raising the following error:
MISCONF Redis is configured to save RDB snapshots but is currently not able to persist to disk. Commands that may modify the data set are disabled. Please check Redis logs for details...
I checked the redis log and saw this happening about once a second:
1 changes in 900 seconds. Saving...
Background saving started by pid 22213
Failed opening .rdb for saving: Permission denied
Background saving error
I've read these two SO questions 1, 2 but they haven't helped me find the problem.
ps shows that user "redis" is running the server:
redis 26769 ... /usr/bin/redis-server *:6379
I checked my config file for the redis file name and path:
grep ^dir /etc/redis/redis.conf =>
dir /var/lib/redis
grep ^dbfilename /etc =>
dbfilename dump.rdb
The permissions on /var/lib/redis are 755 and it's owned by redis:redis.
The permissions on /var/lib/redis/dump.rdb are 644 and it's owned by redis:redis too.
I also ran strace on the server process:
ps -C redis-server # pid = 26769
sudo strace -p 26769 -o /tmp/strace.out
But when I examine the output, I don't see any errors. In particular I don't see a "Permission denied" error as I would expect.
Also, /var/lib/redis is not an NFS directory.
Does anyone know what else could be causing this? I'd hate to have to stop using Redis. I know I can run the command "config set stop-writes-on-bgsave-error no" but that doesn't solve the problem.
This is now happening on a daily basis and the only way I can stop the error is to restart the Redis server.
Thanks.
I just had a similar issue. Despite my config file being correct, when I checked the actual dbfilename and dir in redis-cli, they were incorrect.
Run redis-cli and then
CONFIG GET dbfilename
which should return something like
1) "dbfilename"
2) "dump.rdb"
1) is just the key and 2) is the value. Similarly, running CONFIG GET dir should return something like
1) "dir"
2) "/var/lib/redis"
Confirm that these are correct and if not, set them with CONFIG SET dir /correct/path
Hope this helps!
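The comparison between the config file and the running server can be scripted. This is only a sketch: the helper names are invented for illustration, and it assumes redis-cli prints the key on the first line and the value on the second when its output is piped, and that values in redis.conf are unquoted.

```shell
# Print the value of a directive (e.g. dir, dbfilename) from a redis.conf-style file.
# Commented lines are ignored; if the directive appears more than once, the last one is printed.
redis_conf_value() {
    awk -v key="$2" '$1 == key { value = $2 } END { print value }' "$1"
}

# Print the value the running server reports for a setting.
redis_runtime_value() {
    redis-cli CONFIG GET "$1" | sed -n '2p'
}

# Example (not run here):
#   for key in dir dbfilename; do
#       file=$(redis_conf_value /etc/redis/redis.conf "$key")
#       live=$(redis_runtime_value "$key")
#       [ "$file" = "$live" ] || echo "$key differs: file=$file running=$live"
#   done
```

A mismatch means the running server was reconfigured after startup, for example through CONFIG SET.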
If you have moved Redis to a new mounted volume: /mnt/data-01.
sudo vim /etc/systemd/system/redis.service
Set ReadWriteDirectories=-/mnt/data-01
sudo mkdir /mnt/data-01/redis
Set chown and chmod on the new redis data dir and rdb file so that they match the original ones:
The permissions on /var/lib/redis are 755 and it's owned by redis:redis
The permissions on /var/lib/redis/dump.rdb are 644 and it's owned by redis:redis
Switch configurations while redis is running
$ redis-cli
127.0.0.1:6379> CONFIG SET dir /data/tmp
127.0.0.1:6379> CONFIG SET dbfilename temp.rdb
127.0.0.1:6379> BGSAVE
tail /var/log/redis/redis.cnf (verify saved)
Start Redis Server in a directory where Redis has write permissions
The answers above will definitely solve your problem, but here's what's actually going on:
The default location for storing the dump.rdb file is ./ (denoting the current directory). You can verify this in your redis.conf file. Therefore, the directory from which you start the redis server is where the dump.rdb file will be created and updated.
Since you say your redis server has been working fine for a while and this just started happening, it seems you have started running the redis server in a directory where redis does not have the correct permissions to create the dump.rdb file.
To make matters worse, redis will also probably not allow you to shut down the server either until it is able to create the rdb file to ensure the proper saving of data.
To solve this problem, you must go into the active redis client environment using redis-cli and update the dir key and set its value to your project folder or any folder where non-root has permissions to save. Then run BGSAVE to invoke the creation of the dump.rdb file.
CONFIG SET dir "/hardcoded/path/to/your/project/folder"
BGSAVE
(Now, if you need to save the dump.rdb file in the directory that you started the server in, then you will need to change permissions for the directory so that redis can write to it. You can search stackoverflow for how to do that).
You should now be able to shut down the redis server. Note that we hardcoded the path. Hardcoding is rarely a good practice, and I highly recommend starting the redis server from your project directory and changing the dir key back to ./.
CONFIG SET dir "./"
BGSAVE
That way when you need redis for another project, the dump file will be created in your current project's directory and not in the hardcoded path's project directory.
You can resolve this problem by going into the redis-cli
Type redis-cli in the terminal
Then write config set stop-writes-on-bgsave-error no and it resolved my problem.
Hope it resolved your problem
Up to version 3.2, redis shipped with pretty insane defaults which opened the port to the public. In combination with the CONFIG SET command, anybody can easily change your redis config from outside. If the error starts after some time, someone probably changed your config.
On your local machine check that
telnet SERVER_IP REDIS_PORT
is denied. Otherwise check your config, you should have the setting
bind 127.0.0.1
enabled.
Depending on the user that runs redis, you should also check for damage that the intruder has done.
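The bind check can also be done against the config file itself. A minimal sketch, with a made-up function name, that treats 127.0.0.1, ::1 and -::1 as the only safe addresses (that list is an assumption; adjust it if you deliberately bind to a private interface):

```shell
# Succeed only if the config has an active bind line and every address on it is loopback.
# A commented-out bind line counts as missing.
redis_bound_to_loopback() {
    awk '
        $1 == "bind" {
            found = 1
            for (i = 2; i <= NF; i++)
                if ($i != "127.0.0.1" && $i != "::1" && $i != "-::1") exposed = 1
        }
        END { if (found && !exposed) exit 0; exit 1 }
    ' "$1"
}

# Example (not run here):
#   redis_bound_to_loopback /etc/redis/redis.conf || echo "redis may be reachable from outside" >&2
```

This only inspects the file; the telnet test above is still the way to confirm what the running server actually accepts.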

Running celery as daemon does not create PID file

I have been scratching my head over this one for the past few days. I have seen the other questions on Stack Overflow (this is a duplicate question) and I have tried everything to make this work. The workers are running fine, but celery is not starting up as a process.
I run the command:
sudo service celeryd start
and I get:
celery init v10.1.
Using config script: /etc/default/celeryd
celery multi v3.1.23 (Cipater)
> Starting nodes...
> worker1@ip-172-31-21-215: OK
I run:
sudo service celeryd status
and I get:
celery init v10.1.
Using config script: /etc/default/celeryd
celeryd down: no pidfiles found
The celeryd down: no pidfiles found error is what I need to resolve.
I know this question is a duplicate, but bear with me, because I have tried all of the suggested fixes and still cannot resolve it.
I am deploying this script on Amazon Web Services. I am using a virtual environment.
The init.d script is taken directly from here, and I then gave it the required permissions.
Here is my configuration file:
# Names of nodes to start
# most people will only start one node:
CELERYD_NODES="worker1"
# but you can also start multiple and configure settings
# for each in CELERYD_OPTS (see `celery multi --help` for examples):
#CELERYD_NODES="worker1 worker2 worker3"
# alternatively, you can specify the number of nodes to start:
#CELERYD_NODES=10
# Absolute or relative path to the 'celery' command:
# CELERY_BIN="/usr/local/bin/celery"
CELERY_BIN="/home/<user>/.virtualenvs/<virtualenv_name>/bin/celery"
# App instance to use
# comment out this line if you don't use an app
# CELERY_APP="proj"
# or fully qualified:
CELERY_APP="<project_name>.settings:app"
# Where to chdir at start.
CELERYD_CHDIR="/home/<user>/projects/<project_name>/"
# Extra command-line arguments to the worker
CELERYD_OPTS="--time-limit=300 --concurrency=8"
# %N will be replaced with the first part of the nodename.
CELERYD_LOG_FILE="/var/log/celery/%N.log"
CELERYD_PID_FILE="/var/run/celery/%N.pid"
# Workers should run as an unprivileged user.
# You need to create this user manually (or you can choose
# a user/group combination that already exists, e.g. nobody).
CELERYD_USER="celery"
CELERYD_GROUP="celery"
# If enabled pid and log directories will be created if missing,
# and owned by the userid/group configured.
CELERY_CREATE_DIRS=1
I created the celery user by following the process in this article.
My project is a Django project and I have specified the DJANGO_SETTINGS_MODULE environment variable in the celery setting file as specified in the documentation and also in the stackoverflow answer.
Do I need to change anything in the init.d script or anything else that needs to be added in the celery configuration file... Is it about the celery user that I have created because I also tried specifying
CELERYD_USER = ""
CELERYD_GROUP = ""
while also changing the DEFAULT_USER value to "" in the init.d script.
Still the issue persisted.
In one of the answers it was also told that there might be some errors in the project... but I did not find any such errors all thanks to my test cases.
PS: I have written <user>, <virtualenv_name> and <project_name> as placeholders for privacy reasons; in my files they have their original names.
I was having a similar issue on my Ubuntu server ([ERROR 2] FILE NOT FOUND). It turns out the /var/run/celery/ directory doesn't get created automatically, even if you set that in the celery.service configuration from the celery example docs. You can create that directory and grant the right permissions manually, but as soon as you reboot the server the directory will vanish, because it lives in a temporary directory.
After some reading about how the linux system operates, I found out you just need to create a configuration file in /etc/tmpfiles.d/celery.conf with these lines
d /var/run/celery 0755 admin admin -
d /var/log/celery 0755 admin admin -
Note: you will need to use a different user:group other than 'admin' or you can create a user:group called admin specifically to handle your celery process.
You can read more about this configuration and the way it operates by typing
man tmpfiles.d
I had the issue and solved it just now, thank god! For me it was a permission issue. I had expected the problem to be in /var/run/celery or /var/log/celery, but it turned out to be the log file I had set up for Django logging. For some reason celery wanted to write to that file (I have to look into that) but had no permission. I found the error by running the init script verbosely and skipping the daemonization step:
# C_FAKEFORK=1 sh -x /etc/init.d/celeryd start
This is an old thread but if anyone of you run into this error, I hope this may help!
Good luck!
I saw the same issue and it turned out to be a permissions issue.
Make sure to set the user/group that celery is running under to own the /var/log/celery/ and /var/run/celery/ folders.
See here for a step by step example:
Daemonizing celery
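A quick way to check the ownership described above is sketched below. The helper name is invented for this example, and stat -c is the GNU coreutils form, so this assumes a Linux host:

```shell
# Succeed if the directory exists and is owned by the given user.
dir_owned_by() {
    [ -d "$1" ] && [ "$(stat -c '%U' "$1")" = "$2" ]
}

# Example (not run here; "celery" is the CELERYD_USER from the question's config):
#   for d in /var/run/celery /var/log/celery; do
#       dir_owned_by "$d" celery || echo "$d is missing or not owned by celery" >&2
#   done
```

Running the check again after a reboot shows whether /var/run/celery survived, which is the situation the tmpfiles.d answer addresses.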

Not able to Start/Stop Spark Worker from Remote Machine

I have two machines A and B. I am trying to run Spark Master on machine A and Spark Worker on machine B.
I have set machine B's host name in conf/slaves in my Spark directory.
When I execute start-all.sh to start the master and workers, I get the messages below on the console:
abc@abc-vostro:~/spark-scala-2.10$ sudo sh bin/start-all.sh
sudo: /etc/sudoers.d is world writable
starting spark.deploy.master.Master, logging to /home/abc/spark-scala-2.10/bin/../logs/spark-root-spark.deploy.master.Master-1-abc-vostro.out
13/09/11 14:54:29 WARN spark.Utils: Your hostname, abc-vostro resolves to a loopback address: 127.0.1.1; using 1XY.1XY.Y.Y instead (on interface wlan2)
13/09/11 14:54:29 WARN spark.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Master IP: abc-vostro
cd /home/abc/spark-scala-2.10/bin/.. ; /home/abc/spark-scala-2.10/bin/start-slave.sh 1 spark://abc-vostro:7077
xyz@1XX.1XX.X.X's password:
xyz@1XX.1XX.X.X: bash: line 0: cd: /home/abc/spark-scala-2.10/bin/..: No such file or directory
xyz@1XX.1XX.X.X: bash: /home/abc/spark-scala-2.10/bin/start-slave.sh: No such file or directory
The master starts, but the worker fails to start.
I have set xyz@1XX.1XX.X.X in conf/slaves in my Spark directory.
Can anyone help me resolve this? I am probably missing some configuration on my end.
However, when I run the Spark master and worker on the same machine, it works fine.
Have you copied all of Spark's files to the worker too? The errors show that /home/abc/spark-scala-2.10 does not exist on the worker, and start-all.sh runs start-slave.sh over SSH at the same path as on the master. You also need to set up passwordless access between the master and the worker.
Here are the steps I would follow:
Set up public key authentication over SSH
Check /etc/spark/conf.dist/spark-env.sh
scp this to your computer B from computer A (master)
Set the hostname for computer B in conf/slaves
./start-all.sh
For standalone cluster mode, you may set these options in spark-env.sh.
For example,
export SPARK_WORKER_CORES=2
export SPARK_WORKER_INSTANCES=1
export SPARK_WORKER_MEMORY=4G
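Before running start-all.sh, the two preconditions (key-based SSH and the same Spark path on the worker) can be checked from the master. The following is a sketch only; the function names are made up, and it assumes conf/slaves holds one worker per line with optional # comments:

```shell
# Print the worker entries from a conf/slaves-style file, skipping blank lines and # comments.
list_workers() {
    sed -e 's/#.*//' -e 's/[[:space:]]*$//' -e '/^[[:space:]]*$/d' "$1"
}

# Succeed if key-based SSH works (BatchMode=yes forbids password prompts)
# and the given directory exists on the worker.
check_worker() {
    ssh -n -o BatchMode=yes "$1" test -d "$2"
}

# Example (not run here; the path is the one from the question's log):
#   list_workers conf/slaves | while read -r worker; do
#       check_worker "$worker" /home/abc/spark-scala-2.10 || echo "$worker is not ready" >&2
#   done
```

A failure here points at either the missing SSH key setup or the missing Spark directory on the worker, which are the two problems visible in the question's output.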
See the "SSH access" section in Michael Noll's Hadoop multi-node cluster tutorial. Setting up SSH the same way should solve your problem:
http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/