OpenMPI Multiple Instruction Multiple Data (MIMD) syntax - C++

One of my MPI programs keeps crashing, and I want to write separate debugging output for each task to its own file. The output file is specified via a command-line argument. I read in the OpenMPI documentation that there is the possibility of launching several commands via the syntax
mpirun [global options] [local options 1] ./main --file debug.log1 : [local options 2] ./main --file debug.log2 : ...
However, I was not able to find out which flags are considered global and which are local. Most importantly, I don't know how to handle the number of processes, i.e. "-np". Should I give it as a global option or a local one?
mpirun --map-by node --bind-to none -np 3 ./main --file debug.log1 : ./main --file debug.log2 : ./main --file debug.log3
or
mpirun --map-by node --bind-to none -np 1 ./main --file debug.log1 : -np 1 ./main --file debug.log2 : -np 1 ./main --file debug.log3
If the latter is true, does every binary get "-np 1" or "-np [#nodes]"? I suppose arguments like "--map-by" and "--bind-to" are global, is that correct?
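For illustration, what I imagine (but have not been able to confirm) is that -np is local to each application block, so that e.g. two processes per binary would be written as:
mpirun --map-by node --bind-to none -np 2 ./main --file debug.log1 : -np 2 ./main --file debug.log2 : -np 2 ./main --file debug.log3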

Related

How to execute Rake file tasks in parallel

I'm using Rake as the build system for a C project.
When compiling multiple source files, I want to compile them in parallel, using multiple CPU cores.
Can you give me some advice on how to write the Rakefile?
I would like to achieve the equivalent of make -j in Rake.
As a restriction, I would prefer not to install a new gem.
For clarity, here is a simplified Rakefile.
CC = "gcc"
task :default => "hello"
file "hello" => ["hello.o", "message.o"] do
  sh "#{CC} -o hello hello.o message.o"
end
file "hello.o" => "hello.c" do # (1)
  sh "#{CC} -c hello.c"
end
file "message.o" => "message.c" do # (2)
  sh "#{CC} -c message.c"
end
For ordinary tasks, I can use multitask.
However, for file tasks, I don't know how to express this.
I would like to run the file tasks (1) and (2) concurrently.
My environment:
ruby 2.6.4p104 (2019-08-28 revision 67798) [i386-cygwin]
Thank you in advance.
You can create threads and run each compilation in its own thread.
For example:
CC = "gcc"
task :default => "hello"

# Collect the compile threads so the link step can wait for them
threads = []

file "hello" => ["hello.o", "message.o"] do
  threads.each(&:join) # make sure both object files exist before linking
  sh "#{CC} -o hello hello.o message.o"
end

file "hello.o" => "hello.c" do
  threads << Thread.new { sh "#{CC} -c hello.c" }
end

file "message.o" => "message.c" do
  threads << Thread.new { sh "#{CC} -c message.c" }
end
I found out that rake -m treats all tasks as multitasks. I defined regular (non-multi) tasks only for the tasks that I do not want to run in parallel. By invoking something like "rake -m -j 8", I confirmed that the build time was reduced.
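As a usage sketch, assuming a shell where nproc is available to report the number of cores (otherwise pass an explicit job count):
rake -m -j "$(nproc)"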

mpich2 2 executables standard output

I need to launch 2 executables (program1 and program2) with the same mpirun (mpich) command, and I'm trying to debug program1 with gdb. I use this command:
mpirun -n 1 gdb program1 : -n 1 program2
The command correctly opens the gdb console, but if I set a breakpoint somewhere after mpi_init, the screen gets flooded with the standard output of program2. Is there a simple way to redirect the standard output of program2 (only program2) to a file?
My quick solution was to hard-code the suppression of stdout in program2, but I'm sure there must be a more elegant one...
You could try this:
Create a wrapper.sh that redirects stdout and stderr to a file named output:
cat > wrapper.sh << EOF
#!/bin/bash
# run the given command, appending its stdout and stderr to "output"
"\$@" 1>>output 2>&1
EOF
Make it executable:
chmod +x wrapper.sh
Then run mpirun with the wrapper:
mpirun -n 1 gdb program1 : -n 1 ./wrapper.sh program2
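If you want each rank's output in a separate file, a variant of the wrapper can use the rank number provided by the process manager. This is just a sketch: it assumes the launcher exports PMI_RANK to each process (Hydra-based mpich usually does), and falls back to 0 otherwise:
cat > wrapper.sh << EOF
#!/bin/bash
# append this rank's stdout and stderr to its own file, e.g. output.0, output.1, ...
"\$@" 1>>output.\${PMI_RANK:-0} 2>&1
EOF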

Sending pcap file via packetgen dpdk

I am sending a pcap file on port 0 and I get the following error. Any fix would be appreciated!
The command used is:
sudo ./app/x86_64-native-linuxapp-gcc/pktgen -c 0X01 -n 1 --file-prefix=pg -w 4:00.1 -- -m 1.0 -T -P -s 0:~/Downloads/bigFlows.pcap
There are two obvious reasons for the failure:
The number of CPU cores required for pktgen to work is 1 + the number of ports in use.
You have an extra argument in the command executed for pktgen.
Checking the link, it shows the command used is sudo ./app/x86_64-native-linuxapp-gcc/pktgen -c 0X01 -n 1 --file-prefix=pg -w 4:00.1 -- -m 1.0 -T -P -s 0:[~/Downloads/bigFlows.pcap]. You should not use []; instead use 0: followed by the actual path to the pcap.
Note: @SaifUllah, during the live debug both the core count and the pcap path were demonstrated for you.
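A corrected invocation might therefore look roughly like the following sketch, where 0x3 gives pktgen two cores (one for display/timers plus one for the single port), -m 1.0 maps core 1 to port 0, and /path/to/bigFlows.pcap stands for the real absolute path to the capture file:
sudo ./app/x86_64-native-linuxapp-gcc/pktgen -c 0x3 -n 1 --file-prefix=pg -w 4:00.1 -- -m 1.0 -T -P -s 0:/path/to/bigFlows.pcap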

how to hold xterm during debugging an mpi program?

I run the debugger via
mpirun -n 4 xterm -e gdb -x commands.gdb ./my_mpi_programm
where the file "commands.gdb" just contains the commands
start
continue
The problem is that my 4 xterms close immediately, before I get a chance to inspect the error message or do any debugging.
I'm using the latest Ubuntu distribution. However, on my friend's old SUSE distribution, the xterms stay open.
How can I force the xterms to stay open?
EDIT: the "-hold" option doesn't work, and neither does
mpirun -n 4 xterm -e "gdb -x commands.gdb ./my_mpi_programm;bash"
Try
mpirun -n 4 xterm -e bash -c 'gdb -x commands.gdb ./my_mpi_programm; sleep 60'
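If a fixed sleep is too short or too long, a variant that drops into an interactive shell once gdb exits should also keep each window open (same idea, just replacing the sleep):
mpirun -n 4 xterm -e bash -c 'gdb -x commands.gdb ./my_mpi_programm; exec bash'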

Using gprof with LULESH benchmark

I've been trying to compile and run the LULESH benchmark
https://codesign.llnl.gov/lulesh.php
https://codesign.llnl.gov/lulesh/lulesh2.0.3.tgz
with gprof, but I always get a segmentation fault. I updated these flags in the Makefile:
CXXFLAGS = -g -pg -O3 -I. -Wall
LDFLAGS = -g -pg -O3
[andrestoga#n01 lulesh2.0.3]$ mpirun -np 8 ./lulesh2.0 -s 16 -p -i 10
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 30557 on node n01 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
The gprof webpage says the following:
If you are running the program on a system which supports shared libraries you may run into problems with the profiling support code in a shared library being called before that library has been fully initialised. This is usually detected by the program encountering a segmentation fault as soon as it is run. The solution is to link against a static version of the library containing the profiling support code, which for gcc users can be done via the `-static' or `-static-libgcc' command line option. For example:
gcc -g -pg -static-libgcc myprog.c utils.c -o myprog
I added the -static command line option and still got a segmentation fault.
I found a PDF where they profiled LULESH by adding the -pg command line option to the Makefile, although they didn't describe the exact changes they made.
http://periscope.in.tum.de/releases/latest/pdf/PTF_Best_Practices_Guide.pdf
Page 11
Could someone help me out please?
Best,
Make sure all libraries are loaded:
openmpi (which you have done already)
gcc
You can also run with parameters that help you work out whether the problem is your machine running out of resources. If the machine does not support that number of processes, see how many processes (MPI or not) it supports by looking at the architecture topology. This will tell you the correct number of jobs/processes you can launch on the system.
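On Linux, for example, you can check this with standard tools (lstopo is part of the hwloc package and may need to be installed separately):
nproc    # number of logical CPUs available
lscpu    # sockets, cores per socket, threads per core
lstopo   # full hardware topology (hwloc)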
Very quick run:
mpirun -np 1 ./lulesh2.0 -s 1 -p -i 1
Running problem size 1^3 per domain until completion
Num processors: 1
Num threads: 2
Total number of elements: 1
To run other sizes, use -s <integer>.
To run a fixed number of iterations, use -i <integer>.
To run a more or less balanced region set, use -b <integer>.
To change the relative costs of regions, use -c <integer>.
To print out progress, use -p
To write an output file for VisIt, use -v
See help (-h) for more options
cycle = 1, time = 1.000000e-02, dt=1.000000e-02
Run completed:
Problem size = 1
MPI tasks = 1
Iteration count = 1
Final Origin Energy = 4.333329e+02
Testing Plane 0 of Energy Array on rank 0:
MaxAbsDiff = 0.000000e+00
TotalAbsDiff = 0.000000e+00
MaxRelDiff = 0.000000e+00
Elapsed time = 0.00 (s)
Grind time (us/z/c) = 518 (per dom) ( 518 overall)
FOM = 1.9305019 (z/s)
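Once the crash is fixed and the run completes, one way to get a separate gprof profile per MPI rank is glibc's GMON_OUT_PREFIX environment variable, which makes each process write <prefix>.<pid> instead of all ranks overwriting a single gmon.out. A sketch, assuming glibc and Open MPI's -x flag for exporting environment variables to the ranks:
export GMON_OUT_PREFIX=gmon.out
mpirun -np 8 -x GMON_OUT_PREFIX ./lulesh2.0 -s 16 -p -i 10
gprof ./lulesh2.0 gmon.out.* > profile.txt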