I have a question about Bash. I have been running some of my own C++ programs together with commercial programs and controlling their interaction (via input and output files) with Bash scripts. If I run my C++ program on its own in a terminal, it completes in around 10–15 seconds, but when I run the same program from the Bash script it can take up to 5 minutes each time.
System Monitor shows that running the program directly in a terminal consistently uses 100% of one CPU, whereas when I run it from the Bash script (in a loop) a maximum of 60% CPU usage is recorded, which seems to be linked to the longer completion time (although the average CPU usage across the 4 processors is higher).
This is quite frustrating, as until recently it was not a problem.
An example of the code:
#!/usr/bin/bash
DIR="$1"
TRCKDIR=$DIR/TRCKRSLTS
STRUCTDIR=$DIR
SHRTTRCKDIR=$TRCKDIR/SHRT_TCK_FILES
VTAL=VTAL.png
VTAR=VTAR.png
NAL=$(find $STRUCTDIR | grep NAL)
NAR=$(find $STRUCTDIR | grep NAR)
AMYL=$(find $STRUCTDIR | grep AMYL)
AMYR=$(find $STRUCTDIR | grep AMYR)
TCKFLS=($(find $TRCKDIR -maxdepth 1 | grep .fls))
numTCKFLS=${#TCKFLS[@]}
for i in $(seq 0 $[numTCKFLS-1]); do
filenme=${TCKFLS[i]}
filenme=${filenme%.t*}
filenme=${filenme##*/}
if [[ "$filenme" == *VTAL* || "$filenme" == *VTA_L* ]]; then
STREAMLINE_CUTTER -MRT ${TCKFLS[i]} -ROI1 $VTAL -ROI2 $NAL -op "$SHRTTRCKDIR"/"$filenme"_VTAL_NAL.fls
STREAMLINE_CUTTER -MRT ${TCKFLS[i]} -ROI1 $VTAL -ROI2 $AMYL -op "$SHRTTRCKDIR"/"$filenme"_VTAL_AMYL.fls
fi
if [[ "$filenme" == *VTAR* || "$filenme" == *VTA_R* ]];then
STREAMLINE_CUTTER -MRT ${TCKFLS[i]} -ROI1 $VTAR -ROI2 $NAR -op "$SHRTTRCKDIR"/"$filenme"_VTAR_NAR.fls
STREAMLINE_CUTTER -MRT ${TCKFLS[i]} -ROI1 $VTAR -ROI2 $AMYR -op "$SHRTTRCKDIR"/"$filenme"_VTAR_AMYR.fls
fi
done
I have a cron job that processes a lot of data and then deletes all the temp files it creates. During execution I get 'ERROR: Insufficient space in file WORK.AIB_CUSTOMER_DATA.DATA.' The current work directory has 50G free; when I run the code in another directory with 170G free, I don't get the error. I want to track the size of the working directory during execution.
I'm afraid I might not fully understand your problem.
To get an idea of how fast it is growing, you could run a simple script like:
#!/bin/bash
while true
do
#uncomment this to check all partitions of the system.
#df -h >> log.log
#uncomment this to check the files in the current folder.
#du -sh * >> log.log
sleep 1
done
Then analyze the logs and see the increase in size.
I wrote this script and let it run during the job execution to monitor the directory size and get the maximum amount of size for this work directory.
#!/bin/bash
Max=0
while true
do
SIZE=`du -sh -B1 /data/work/EXAMPLE_work* | awk '{print $1}' `
echo size: $SIZE
echo max: $Max
if [ "$SIZE" -ge $Max ]
then
echo "big size: $SIZE" > /home/mmm/Biggestsize.txt
Max=$SIZE
else
echo "small size: $SIZE" > /home/mmm/sizeSmall.txt
fi
done
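A possible variation on the script above, as a minimal sketch rather than a definitive fix. If the glob matches more than one directory, du prints one line per directory, so the sizes are summed here before comparing. The function names total_kib and peak_kib are made up for this example, and the loop sleeps between samples instead of polling continuously.

```shell
#!/bin/bash
# Sum the disk usage (in KiB) of every path given and print one number.
# Paths that do not exist are silently skipped.
total_kib() {
    du -sk -- "$@" 2>/dev/null | awk '{ sum += $1 } END { printf "%.0f\n", sum }'
}

# Sample the given paths "count" times, "interval" seconds apart,
# and print the largest total that was observed.
peak_kib() {
    count=$1
    interval=$2
    shift 2
    max=0
    i=0
    while [ "$i" -lt "$count" ]; do
        size=$(total_kib "$@")
        if [ "$size" -gt "$max" ]; then
            max=$size
        fi
        sleep "$interval"
        i=$((i + 1))
    done
    echo "$max"
}

# Example: sample once per second for an hour (path taken from the script above).
# peak_kib 3600 1 /data/work/EXAMPLE_work*
```

Note that du only counts files it can see at the moment it runs, so a very short-lived temp file can still be missed between two samples.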
I have an issue with graceful exiting my slurm jobs with saving data, etc.
I have a signal handler in my program which sets a flag, which is then queried in a main loop and a graceful exit with data saving follows. The general scheme is something like this:
#include <csignal>
#include <utility>
#include <atomic>
#include <fstream>
#include <unistd.h>
namespace {
std::atomic<bool> sigint_received = false;
}
void sigint_handler(int) {
sigint_received = true;
}
int main() {
std::signal(SIGTERM, sigint_handler);
while(true) {
usleep(10); // There are around 100 iterations per second
if (sigint_received)
break;
}
std::ofstream out("result.dat");
if (!out)
return 1;
out << "Here I save the data";
return 0;
}
Batch scripts are unfortunately complicated because:
I want hundreds of parallel, low-thread-count independent tasks, but my cluster allows only 16 jobs per user
srun in my cluster always claims a whole node, even if I don't want all cores, so in order to run multiple processes on a single node I have to use bash
Because of it, batch script is this mess (2 nodes for 4 processes):
#!/bin/bash -l
#SBATCH -N 2
#SBATCH more slurm stuff, such as --time, etc.
srun -N 1 -n 1 bash -c '
./my_program input1 &
./my_program input2 &
wait
' &
srun -N 1 -n 1 bash -c '
./my_program input3 &
./my_program input4 &
wait
' &
wait
Now, to propagate signals sent by slurm, I have even a bigger mess like this (following this answer, in particular double waits):
#!/bin/bash -l
#SBATCH -N 2
#SBATCH more slurm stuff, such as --time, etc.
trap 'kill $(jobs -p) && wait' TERM
srun -N 1 -n 1 bash -c '
trap '"'"'kill $(jobs -p) && wait'"'"' TERM
./my_program input1 &
./my_program input2 &
wait
' &
srun -N 1 -n 1 bash -c '
trap '"'"'kill $(jobs -p) && wait'"'"' TERM
./my_program input3 &
./my_program input4 &
wait
' &
wait
For the most part it works. But, firstly, I get error messages at the end of the output:
srun: error: nid00682: task 0: Exited with exit code 143
srun: Terminating job step 732774.7
srun: error: nid00541: task 0: Exited with exit code 143
srun: Terminating job step 732774.4
...
and, what is worse, like 4-6 out of over 300 processes actually fail on if (!out) - errno gives "Interrupted system call". Again, guided by this, I guess that my signal handler is called two times - the second one during some syscall under std::ofstream constructor.
Now,
How to get rid of slurm errors and have an actual graceful exit?
Am I correct that signal is sent two times? If so, why, and how can I fix it?
Suggestions:
trap EXIT, not a signal. EXIT happens once, TERM can be delivered multiple times.
use declare -f to transfer code and declare -p to transfer variables to an unrelated subshell
kill can fail, I do not think you should && on it
use xargs (or parallel) instead of reinventing the wheel with kill $(jobs -p)
extract "data" (input1 input2 ...) from "code" (work to be done)
Something along:
# The input.
input="$(cat <<'EOF'
input1
input2
input3
input4
EOF
)"
work() {
# Normally write work to be done.
# For each argument, run `my_program` in parallel.
printf "%s\n" "$@" | xargs -d'\n' -n1 -P0 ./my_program
}
# For each two arguments run `srun....` with a shell that runs `work` in parallel.
# Note - declare -f outputs source-able definition of the function.
# "No more hand escaping!"
# Then the work function is called with arguments passed by xargs inside the spawned shell.
xargs -P0 -n2 -d'\n' <<<"$input" \
srun -N 1 -n 1 \
bash -c "$(declare -f work)"'; work "$@"' --
The -P0 is specific to GNU xargs. GNU xargs handles exit status 255 specially: to make xargs stop as soon as any of the programs fails, you can write a wrapper like xargs ... bash -c './my_program "$@" || exit 255' -- || exit 255.
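The exit-255 behaviour can be tried out in isolation. This is only a sketch: the inline sh -c body is a hypothetical stand-in for ./my_program that fails on the input "bad", and run_until_failure is a name made up for the example.

```shell
#!/bin/bash
# Feed each argument to a command via xargs, one per invocation.
# The sh -c body stands in for ./my_program; it fails on the input "bad",
# and the failure is turned into exit status 255 so that xargs stops.
run_until_failure() {
    printf '%s\n' "$@" |
        xargs -n1 sh -c '{ [ "$1" != bad ] && echo "processed $1"; } || exit 255' --
}

# input3 is never started because xargs aborts at "bad".
run_until_failure input1 bad input3 || echo "xargs exit status: $?"
```

xargs also prints its own diagnostic on stderr when it aborts. The GNU xargs manual documents 124 as its exit status in this case; other implementations may report a different nonzero value.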
If srun preserves environment variables, you can instead export the work function with export -f work and just call it in the child shell, like xargs ... srun ... bash -c 'work "$@"' --.
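The declare -f trick can be checked without srun. A minimal sketch, assuming a plain bash -c child in place of the srun step; the body of work here is a hypothetical stand-in that only reports its arguments, and run_in_fresh_shell is a made-up helper name.

```shell
#!/bin/bash
# Hypothetical stand-in for the real work: report each argument.
work() {
    printf 'working on %s\n' "$@"
}

# A fresh bash process does not know about work(), so the definition
# printed by declare -f is prepended to the command string it runs.
run_in_fresh_shell() {
    bash -c "$(declare -f work)"'; work "$@"' -- "$@"
}

run_in_fresh_shell input1 input2
# prints:
# working on input1
# working on input2
```

Because the function text is generated by bash itself, none of the '"'"' hand-escaping from the original batch script is needed.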
I want to run a program in the background that collects some performance data and then run an application in the foreground. When the foreground application finishes, the script detects this and then closes the background application. The issue is that when the background application is closed without first closing its output file (I'm assuming), the output file remains empty. Is there a way to write the output file continuously so that the output is preserved if the background application closes unexpectedly?
Here is my shell script:
./background_application -o=output.csv &
background_pid=$!
./foreground_application
ps -a | grep foreground_application
if pgrep foreground_application > /dev/null
then
result=1
else
result=0
fi
while [ "$result" -ne 0 ]
do
if pgrep RPx > /dev/null
then
result=1
else
result=0
fi
sleep 10
done
kill $background_pid
echo "Finished"
I have access to the source code for the background application written in C++ it is a basic loop and runs fflush(outputfile) every loop iteration.
This would be shorter:
./background_application -o=output.csv &
background_pid=$!
./foreground_application
cp output.csv output_last_look.csv
kill $background_pid
echo "Finished"
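The same idea as a self-contained sketch. Since the foreground command blocks until it exits, no pgrep polling is needed. The logger loop below is a hypothetical stand-in for ./background_application: it appends with >>, so every line is already in the file when the process is killed. The name run_with_logger is made up for this example.

```shell
#!/bin/bash
# Run a foreground command while a background logger appends to a file,
# then stop the logger and return the foreground command's exit status.
run_with_logger() {
    out=$1
    shift

    # Stand-in for ./background_application: one timestamp per second.
    ( while :; do date +%s >> "$out"; sleep 1; done ) &
    logger_pid=$!

    status=0
    "$@" || status=$?        # blocks until the foreground command exits

    kill "$logger_pid" 2>/dev/null
    wait "$logger_pid" 2>/dev/null || true   # reap it; ignore its "killed" status
    return "$status"
}

# Example (names from the script above):
# run_with_logger output.csv ./foreground_application
```

Whether the real background application behaves the same way depends on how it reacts to SIGTERM, which this sketch cannot show.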
I ran a performance test with a really surprising result: perl is more than 20 times faster!
Is this normal?
Is it caused by my regular expression?
Is egrep far slower than grep?
... I tested on a current cygwin and a current OpenSuSE 13.1 in VirtualBox.
Fastest Test with perl:
time zcat log.gz \
| perl -ne 'print if ($_ =~ /^\S+\s+\S+\s+(ERROR|WARNING|SEVERE)\s/ )' \
| tail
2014-06-24 14:51:43,929 SEVERE ajp-0.0.0.0-8009-13 SessionDataUpdateManager cannot register active data when window has no name
2014-06-24 14:52:01,031 ERROR HFN SI ThreadPool(4)-442 CepEventUnmarshaler Unmarshaled Events Duration: 111
2014-06-24 14:52:03,556 ERROR HFN SI ThreadPool(4)-444 CepEventUnmarshaler Unmarshaled Events Duration: 52
2014-06-24 14:52:06,789 SEVERE ajp-0.0.0.0-8009-1 SessionDataUpdateManager cannot register active data when window has no name
2014-06-24 14:52:06,792 SEVERE ajp-0.0.0.0-8009-1 SessionDataUpdateManager cannot register active data when window has no name
2014-06-24 14:52:07,371 SEVERE ajp-0.0.0.0-8009-9 SessionDataUpdateManager cannot register active data when window has no name
2014-06-24 14:52:07,373 SEVERE ajp-0.0.0.0-8009-9 SessionDataUpdateManager cannot register active data when window has no name
2014-06-24 14:52:07,780 SEVERE ajp-0.0.0.0-8009-11 SessionDataUpdateManager cannot register active data when window has no name
2014-06-24 14:52:07,782 SEVERE ajp-0.0.0.0-8009-11 SessionDataUpdateManager cannot register active data when window has no name
2014-06-24 15:06:24,119 ERROR HFN SI ThreadPool(4)-443 CepEventUnmarshaler Unmarshaled Events Duration: 117
real 0m0.151s
user 0m0.062s
sys 0m0.139s
fine!
far slower test with egrep:
time zcat log.gz \
| egrep '^\S+\s+\S+\s+(ERROR|WARNING|SEVERE)\s' \
| tail
...
real 0m2.454s
user 0m2.448s
sys 0m0.092s
(Output was same as above...)
finally even slower grep with different notation (my first try)
time zcat log.gz \
| egrep '^[^\s]+\s+[^\s]+\s+(ERROR|WARNING|SEVERE)\s' \
| tail
...
real 0m4.295s
user 0m4.272s
sys 0m0.138s
(Output was same as above...)
The uncompressed file is about 2,000,000 lines and 500 MB; the number of matching lines is very small.
my tested versions:
OpenSuSE with grep (GNU grep) 2.14
cygwin with grep (GNU grep) 2.16
Perhaps a bug in newer grep versions?
You should be able to make the Perl a little bit faster by making the parentheses non-capturing:
(?:ERROR|WARNING|SEVERE)
Also, it's unnecessary to match against $_. $_ is assumed if there is nothing specified. That's why it exists.
perl -ne 'print if /^\S+\s+\S+\s+(?:ERROR|WARNING|SEVERE)\s/'
You are being tricked by your operating system's cache. Reading and grepping a file passes through several cache layers on the way to the filesystem:
Harddrive own cache
OS read cache
To really know what's going on, it's a good idea to warm up these caches first by running the tests a few times without counting the results. After these warm-up runs, start measuring the running time.
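A warm-up run followed by a timed run could look like the sketch below. The helper name bench is made up for this example, and it relies on bash's built-in time keyword, so it is not portable to plain sh.

```shell
#!/bin/bash
# Run a command once untimed so the file is in the OS cache,
# then run it again under `time`. Output of the command is discarded;
# the timing report goes to stderr.
bench() {
    "$@" > /dev/null 2>&1      # warm-up run, result not counted
    time "$@" > /dev/null      # timed run
}

# Example with one of the pipelines from the question:
# bench sh -c "zcat log.gz | egrep '^\S+\s+\S+\s+(ERROR|WARNING|SEVERE)\s' | tail"
```

Running every variant through the same helper means each one is measured with the caches in the same state.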
As Chris Hamel commented: without the "|" alternation, grep becomes more than 10 times faster, though it is still slower than perl.
time zcat log.gz \
| egrep '^[^\s]+\s+[^\s]+\s+(ERROR)\s' \
| tail
...
real 0m0.216s
user 0m0.062s
sys 0m0.123s
So with two "|" alternations the grep run is more than 3 times slower than running three greps one after another. That sounds like a bug to me... Is there an earlier grep version to test with? I also have a Red Hat 5 machine... grep seems similarly slow there.
I need to know the average volume of an mp3 file so that when I convert it to mp3 (at a different bitrate) I can scale the volume too, to normalize it...
Therefore I need a command line tool / ruby library that gives me the average volume in dB.
You can use sox (an open source command line audio tool http://sox.sourceforge.net/sox.html) to normalize and transcode your files at the same time.
EDIT
Looks like it doesn't have options for bit-rate. Anyway, sox is probably overkill if LAME does normalization.
You can use LAME to encode to mp3. It has options for normalization, scaling, and bitrate. LAME also compiles to virtually any platform.
I wrote a little wrapper script, based on the above input:
#!/bin/sh
# Get the current volume (will reset to this later).
current=`amixer -c 0 get Master 2>&1 |\
awk '/%/ {
p=substr($4,2,length($4)-2);
if( substr(p,length(p)) == "%" )
{
p = substr(p,1,length(p)-1)
}
print p
}'`
# Figure out how loud the track is. The normal amplitude for a track is 0.1.
# Ludicrously low values are 0.05, high is 0.37 (!!?)
rm -f /tmp/$$.out
/usr/bin/mplayer -vo null -ao pcm:file=/tmp/$$.out $1 >/dev/null 2>&1
if [ $? = 0 ] ; then
amplitude=`/usr/bin/sox /tmp/$$.out -n stat 2>&1 | awk '/RMS.+amplitude/ {print $NF}'`
fi
rm -f /tmp/$$.out
# Set an appropriate volume for the track.
to=`echo $current $amplitude | awk '{printf( "%.0f%%", $1 * 0.1/$2 );}'`
echo $current $amplitude | awk '{print "Amplitude:", $2, " Setting volume to:", 10/$2 "%, mixer volume:", $1 * 0.1/$2}'
amixer -c 0 set Master $to >/dev/null 2>&1
mplayer -quiet -cache 2500 $1
# Reset the volume for next time.
amixer -c 0 set Master "$current%" >/dev/null 2>&1
It takes an extra second to start up playing the file, and relies on alsamixer to adjust the volume, but it does a really nice job of keeping you from having to constantly tweak the master volume. And it doesn't really care what the input format is, since if mplayer can play it at all, it can extract the audio, so it should work fine with MP3, Ogg, AVI, whatever.
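The scaling arithmetic in the script is current volume × 0.1 / RMS amplitude, with 0.1 as the reference amplitude. A small sketch of just that step, with a guard for the case where no amplitude was measured (for example when mplayer failed); the function name target_volume is made up for this example.

```shell
#!/bin/sh
# Scale the mixer volume so that a track with RMS amplitude "amp" plays
# about as loud as a track at the reference amplitude 0.1 would.
# $1 = current mixer volume in percent, $2 = measured RMS amplitude.
target_volume() {
    awk -v cur="$1" -v amp="$2" 'BEGIN {
        if (amp + 0 <= 0) {              # no usable measurement: keep current volume
            printf "%.0f%%\n", cur
            exit
        }
        printf "%.0f%%\n", cur * 0.1 / amp
    }'
}

target_volume 60 0.2     # prints 30%
target_volume 60 0.05    # prints 120%
```

As the second call shows, a very quiet track can ask for more than 100%, so a real script may want to cap the result.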
http://mp3gain.sourceforge.net/ is a well thought out solution for this.