running parallel code on PC - fortran

I have Fortran code that has been parallelized with OpenMP. I want to test my code on my PC before running it on the HPC system. My PC has a dual-core CPU and I work on Linux Mint. I installed gfortran-multilib, and this is my script:
#!/bin/bash
### Job name
#PBS -N pme
### Keep Output and Error
#PBS -j eo
### Specify the number of nodes and thread (ppn) for your job.
#PBS -l nodes=1:ppn=2
### Switch to the working directory;
cd $PBS_O_WORKDIR
### Run:
OMP_NUM_THREADS=$PBS_NUM_PPN
export OMP_NUM_THREADS
ulimit -s unlimited
./a.out
echo 'done'
What more should I do to run my code?
OK, I changed the script as suggested in the answers:
#!/bin/bash
### Switch to the working directory;
cd Desktop/test
### Run:
OMP_NUM_THREADS=2
export OMP_NUM_THREADS
ulimit -s unlimited
./a.out
echo 'done'
My code and its executable are in the folder test on my Desktop, so:
cd Desktop/test
Is this correct?
Then I compile my simple code:
implicit none
!$OMP PARALLEL
write(6,*)'hi'
!$OMP END PARALLEL
end
with the command:
gfortran -fopenmp test.f
and then run by:
./a.out
But only one "hi" is printed as output. What should I do?
(And a question about this site: in a situation like this, should I edit my post or just add a comment?)

You don't need, and probably don't want, to use the script on your PC. Not even to learn how to use such a script, because these scripts are too tied to the specifics of each supercomputer.
I use several supercomputers/clusters and I cannot just reuse the script from one on another, because they differ so much.
On your PC you should just do:
Optional (it is probably the default):
export OMP_NUM_THREADS=2
to set the number of OpenMP threads to 2. Adjust if you need some other number.
cd to the working directory
cd my_working_directory
Your working directory is the directory where you have the required data or where the executable resides. In your case it seems to be the directory where a.out is.
run the damn thing
ulimit -s unlimited
./a.out
That's it.
You can also redirect the standard output and error output to files
./a.out > out.txt 2> err.txt
to mimic the supercomputer behaviour.
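Put together, a minimal run script for the PC might look like this (a sketch only; the Desktop/test path is taken from the question and may need adjusting):
#!/bin/bash
cd ~/Desktop/test            # working directory containing a.out (path from the question)
export OMP_NUM_THREADS=2     # two threads for the dual-core CPU
ulimit -s unlimited
./a.out > out.txt 2> err.txt
echo 'done'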
The PBS variables are only set when you run the script using qsub. You probably don't have that on your PC and you probably don't want to have it either.
$PBS_O_WORKDIR is the directory where you run the qsub command, unless you set it differently by other means.
$PBS_NUM_PPN is the number you indicated in #PBS -l nodes=1:ppn=2. The queue system reads that and sets this variable for you.

The script you posted is for the Portable Batch System (https://en.wikipedia.org/wiki/Portable_Batch_System) queue system. That means that the job you want to run on the HPC infrastructure first has to go into the queue system, and when the resources are available the job will run on the system.
Some of the commands (those starting with #PBS) are specific to this queue system. Among these commands, some allow the user to indicate the application's process hierarchy (i.e. the number of processes and threads). Also, keep in mind that since all the PBS commands start with # they are ignored by regular shell script execution. In the case you presented, that is given by
### Specify the number of nodes and thread (ppn) for your job.
#PBS -l nodes=1:ppn=2
which, as the comment indicates, tells the queue system that you want to run 1 process and that each process will have 2 threads. The queue system is likely to pass these parameters to the process launcher (srun/mpirun/aprun/... for MPI apps, in addition to OMP_NUM_THREADS for OpenMP apps).
If you want to run this job on a computer that does not have a PBS queue, you should be aware of at least two things.
1) The following command
### Switch to the working directory;
cd $PBS_O_WORKDIR
will be translated into just "cd" because the environment variable PBS_O_WORKDIR is only defined within the PBS job context. So you should change this command (or execute another cd command just before the execution) to set where you want to run the job.
2) Similarly for PBS_NUM_PPN environment variable,
OMP_NUM_THREADS=$PBS_NUM_PPN
export OMP_NUM_THREADS
this variable won't be defined if you don't run the script within a PBS job context, so you should set OMP_NUM_THREADS to the value you want (2, according to your question) manually.
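One possible way to keep a single script that works both with and without PBS is to fall back to defaults when the PBS variables are unset (a sketch; the fallback values are assumptions taken from the question):
cd "${PBS_O_WORKDIR:-$HOME/Desktop/test}"   # use the PBS directory if set, otherwise the test folder
export OMP_NUM_THREADS=${PBS_NUM_PPN:-2}    # use ppn from PBS if set, otherwise 2 threads
ulimit -s unlimited
./a.out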

If you want your Linux box environment to be like an HPC login node, you can do the following:
Make sure that your compiler supports OpenMP, test a simple hello world program with OpenMP flags
Install OpenMPI on your system from your favourite package manager or download the source/binary from the website (OpenMPI Download)
I would not recommend installing a cluster manager like Slurm for your experiments
After you are done, you can execute your MPI programs through the mpirun wrapper
mpirun -n <no_of_cores> <executable>
EDIT:
This assumes that you are running MPI only. Note that OpenMP uses the cores as well: if you are running MPI+OpenMP, then n * OMP_NUM_THREADS should equal the number of cores on a single node.
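For example, on a dual-core machine a hybrid MPI+OpenMP run might be launched like this (a sketch; the executable name is a placeholder):
export OMP_NUM_THREADS=2   # 2 OpenMP threads per MPI process
mpirun -n 1 ./a.out        # 1 MPI process x 2 threads = 2 cores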

Related

How to use gnu_parallel to run multiple executable and/or bash scripts?

I've been recently attempting to run my scripts in parallel in a more convenient way than opening several instances of a terminal and executing the scripts separately.
I've been trying to learn how to use gnu_parallel for the past couple of days and I am still a bit clueless; I am hoping someone can provide a direct example.
Suppose I have a g++ compiled code called blah.exe and a bash script called blah.sh that each run alone perfectly fine, but I want to execute them in different directories.
I've been reading
https://www.gnu.org/software/parallel/man.html#EXAMPLE:-Working-as-xargs--n1.-Argument-appending
and
https://www.biostars.org/p/182136/
but I am not totally clear about the syntax.
To run these in series, I would do:
for i in 1 2 3 4
do
    mv ./blah.exe directory$i
    cd directory$i
    ./blah.exe all
    cd ..
done
similarly
for i in 1 2 3 4
do
    mv ./blah.sh directory$i
    cd directory$i
    source ./blah.sh all
    cd ..
done
I am trying to understand how I would split this load across 4 logical threads in one command using parallel.
Could someone provide an example for this?
Thank you for your time.
Something like:
parallel --dry-run 'cd directory{}; ../blah.exe all; source ../blah.sh all' ::: {1..4}
No need to copy/move the executable, just run the same one.
No need to cd .. afterwards, as it's a new process each time.
Note this is not multi-threading, it is multi-processing.
If you want to process discontiguous directory numbers, you can use:
parallel ... ::: {1..4} 6 7 {11..14}
If you want to process all directories, you can use:
printf "%s\0" */ | parallel -0 'cd {}; pwd'
If you want to process all directories starting with FRED, you can use:
printf "%s\0" FRED*/ | parallel -0 'cd {}; pwd'
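And if you want to cap the run at 4 simultaneous jobs, as in the question, you can pass -j to parallel and drop --dry-run once the printed commands look right (a variant of the command above, nothing else changed):
parallel -j4 'cd directory{}; ../blah.exe all; source ../blah.sh all' ::: {1..4}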

autostart webserver and program

I'm working on a Yocto based system. My problem is that I can't start my program written in C++ and the webserver (node.js) at the same time right after my device boots.
I already tried this in /etc/init.d:
#! /bin/bash
/home/ProjectFolder/myProject
cd /home/myapp && DEBUG=myapp:* npm start
exit 0
I changed the permissions after creating the script with
chmod +x ./startProg.sh
After that I linked it with
update-rc.d startProg.sh defaults
After a reboot the system only starts the C++ program. I tried some other possibilities, like separating the two commands into different shell scripts, but that didn't work out any better.
Is there any option I missed or did I make any mistake trying to put those two processes into the autostart?
This of course isn't a C++ or Node.js question. A shell script is a list of commands that are executed in order, unless specified otherwise. So your shell script runs your two programs in the order specified: first myProject, and only when that's done will npm be started.
This is the same as what would happen at the prompt, and the solution is the same: /home/ProjectFolder/myProject &
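Applied to the init script from the question, that might look like the following (a sketch only; whether npm start should also be backgrounded depends on whether the init script is expected to return):
#! /bin/bash
# start the C++ program in the background so the script can continue
/home/ProjectFolder/myProject &
# then start the node.js webserver
cd /home/myapp && DEBUG=myapp:* npm start &
exit 0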

my System V init script doesn't return

This is the script content, located in /etc/init.d/myserviced:
#!/lib/init/init-d-script
DAEMON="/usr/local/bin/myprogram.py"
NAME="myserviced"
DESC="The description of my service"
When I start the service (either by calling the script directly or by calling sudo service myserviced start), I can see the program myprogram.py run, but it does not return to the command prompt.
I guess there must be something that I misunderstood, so what is it?
The system is Debian, running on a Raspberry Pi.
After more work, I finally solved this issue. There are two major reasons:
init-d-script actually calls start-stop-daemon, which doesn't work well with scripts specified via the --exec option. When killing scripts, you should only specify the --name option. However, as init-d-script always fills in the --exec option, it cannot be used with script daemons. I had to write the SysV script myself.
start-stop-daemon won't magically daemonize the thing you provide, so the executable given to start-stop-daemon should daemonize itself rather than be a regular foreground program.
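A hand-written SysV script along these lines might look roughly like this (a sketch only, not the poster's actual script; --background and --make-pidfile are the usual workaround when the program does not daemonize itself, and the paths are taken from the question):
#!/bin/sh
# /etc/init.d/myserviced -- hand-written replacement for the init-d-script version
DAEMON=/usr/local/bin/myprogram.py
PIDFILE=/var/run/myserviced.pid

case "$1" in
  start)
    # --background puts the program in the background, --make-pidfile records its PID
    start-stop-daemon --start --background --make-pidfile --pidfile "$PIDFILE" --startas "$DAEMON"
    ;;
  stop)
    # stop by PID file instead of --exec, which does not work well for scripts
    start-stop-daemon --stop --pidfile "$PIDFILE"
    ;;
  *)
    echo "Usage: $0 {start|stop}"
    exit 1
    ;;
esac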

How do I make a number of looping scripts execute at startup?

I have a few Python scripts, all of them involving a while True: loop and a wait timer, so they run at varying intervals. They do things like monitor a serial port and look for new versions of my code on a remote server. I haven't used cron because some require offsets (e.g. run at ten seconds past the minute) and I wanted to keep things very simple.
Using rc.local, I run hook.py on startup. What can I put in hook.py to run a.py, b.py and c.py simultaneously and continuously? I tried subprocess (with shell=True) but I'm not sure the next line / next subprocess command will execute until the first one finishes, which will never happen. Plus it has some weird behaviour I'm struggling to debug (I can read/write files using their absolute paths if I run the script directly; when subprocess runs them, it can't find the files).
Any suggestions? Just want something simple that can simultaneously execute several new python scripts. Platform is a Raspberry Pi.
Alternatively: if there's code I can put in rc.local that will spawn a new python process for all .py files in a specified directory, that would work too.
This sounds like it would be better suited to spawning via cron than to infinite while loops.
But if you want to continue running them from rc.local, just put an & at the end of your command:
/usr/bin/python /home/you/command.py &
This runs the command in the background.
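Applied to the scripts from the question, that might look like this (a sketch; the paths are assumptions):
# in /etc/rc.local (or the script it calls), before the final exit 0:
/usr/bin/python /home/you/a.py &
/usr/bin/python /home/you/b.py &
/usr/bin/python /home/you/c.py &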
If you want to run all Python files in a given directory, I would write a bash script like:
#!/bin/bash
for file in /home/you/*.py
do
    # only launch regular files (the glob stays unexpanded if nothing matches)
    if [ -f "$file" ]
    then
        /usr/bin/python "$file" &
    fi
done
We will need more information about your path issues to tell you more.

How to work in batch mode

I have inherited an ANSI C++ program that has no GUI and is supposed to run in batch mode, generating lots of data (we are talking 100,000+ ASCII files). We are thinking that in the long term we'll run it under UNIX. For now, I have a MacBook Air running OS X 10.9.4 and I loaded Xcode 5.1.1. It compiles without errors or warnings.
I need to test a program as follows:
<prompt> myprogram datain dataout1 datout2
Where is the compiled program? In which directory? Can I copy my datain file into that directory?
For repeated execution under Windows (Command Prompt window) I normally would have a batch file of the type:
myprogram datain1 dataout11 datout12
myprogram datain2 dataout21 datout22
myprogram datain3 dataout31 datout32
........
myprogram datainn dataoutn1 datoutn2
Can I do something similar with OS X? Where can I find the applicable documentation?
You will want to look for your terminal emulation program; see http://en.wikipedia.org/wiki/Terminal_(OS_X) for how to use it. It should give you the bash shell, which is one of the Unix shells.
You can also write a shell script; see
http://tldp.org/HOWTO/Bash-Prog-Intro-HOWTO.html for some bash shell scripting info.
For such a simple operation, you can write a shell script that will look almost exactly the same as the batch file you use on Windows. The key difference between Windows' cmd.exe and *nix shells here is that the current directory is not part of the search path for executables (the way it is on Windows), so if you put the shell script in the same folder as the compiled executable, you will need to prefix the program name with ./ (to mean "look in the current directory"). For example:
#!/bin/sh
./myprogram datain1 dataout11 datout12
./myprogram datain2 dataout21 datout22
./myprogram datain3 dataout31 datout32
........
./myprogram datainn dataoutn1 datoutn2
If the shell script and executable are not in the same folder, you can use either an absolute path or an appropriate relative path.
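If the runs follow a regular naming pattern, a loop is a possible alternative to listing every command by hand (a sketch, assuming the datainN/dataoutNM naming from the question):
#!/bin/sh
# run the program once per numbered input file
for i in 1 2 3
do
    ./myprogram "datain$i" "dataout${i}1" "datout${i}2"
done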
Also, to run the script you will either need to make it executable:
$ chmod +x myscript.sh
$ ./myscript.sh
or invoke the shell with the script as an argument:
$ sh myscript.sh