How to compile the PMI support for running Chapel/GASNet on Omni-Path networks? - chapel

I'm trying to run Chapel/GASNet on a cluster equipped with an Omni-Path network.
GASNet's official documentation for Omni-Path recommends using the ofi conduit by passing --enable-ofi --disable-psm --disable-ibv. However, since I did not know where to pass these configure flags, I decided to use the PSM conduit for Omni-Path instead.
1) I can run Chapel/GASNet using GASNET_PSM_SPAWNER='ssh'. However, this spawner results in quite slow PGAS performance.
2) I can only use MPI as the spawner if I set -mca mtl ^psm,psm2, which is also slow. Otherwise, I receive several errors.
3) I tried to use PMI as the spawner. However, I receive the following error message: "Spawner is set to PMI, but PMI support was not compiled in. usage: gasnetrun..."
How can I compile in PMI support so that I can use GASNET_PSM_SPAWNER='pmi'?
Here are my other Chapel/GASNet runtime variables:
CHPL_COMM='gasnet'
CHPL_LAUNCHER='gasnetrun_psm'
CHPL_COMM_SUBSTRATE='psm'
CHPL_GASNET_SEGMENT='everything'
CHPL_TARGET_ARCH='native'
HFI_NO_CPUAFFINITY=1
All the best,
Tiago Carneiro.

I don't have easy access to an Omni-path system to test any of this, but in the interest of trying to get you an answer:
It appears to me as though Chapel ought to build and use the ofi-conduit if you do the following:
set CHPL_COMM_SUBSTRATE=ofi in your environment (e.g., export CHPL_COMM_SUBSTRATE=ofi)
re-build Chapel (e.g., make or gmake from $CHPL_HOME)
re-compile and re-run your program
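If it helps, those steps might look like the following in a shell session (a sketch only: `myprog.chpl` and the locale count are placeholders, and `gmake` may be needed instead of `make` on some systems):

```shell
export CHPL_COMM=gasnet
export CHPL_COMM_SUBSTRATE=ofi     # switch from psm to the ofi conduit
cd "$CHPL_HOME"
make                               # rebuild the Chapel runtime for ofi
chpl myprog.chpl -o myprog         # re-compile your program
./myprog -nl 4                     # re-run it, e.g. on 4 locales
```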
The choice of spawner/launcher should not have an impact on your program's performance, as far as I am aware; it is simply the mechanism for getting the executables up and running on the system's compute nodes. So if you have a technique that is working, I'd suggest sticking with it rather than trying other spawners/launchers. (In any case, I'm not personally familiar with how to use the PMI spawner, and I'm fairly certain that Chapel doesn't currently have a launcher that wraps it.)
By contrast, the choice of conduit can have a very large impact on program performance, as it governs how communication takes place throughout the program's execution.
As a reminder: As with any Chapel program, once you have it working correctly and are doing performance studies, be sure to use the --fast flag.

Related

How to append data to an executable and use it inside the application

I want to build a C#/WPF packer.
I am using a C++ application which starts the packed/encrypted C# application.
Currently I have to rebuild this C++ app every time I want to release my main app
(the C#/WPF app is included as an external array of bytes).
Now I want to build a simple tool to do this work, but I don't want to rebuild the "launcher" every time!
So my idea is to just modify the launcher.
For that I need a way to modify this executable, and I need to be able to use the modified data inside the launcher as if it had been compiled in.
I don't want to reserve a statically sized array inside the launcher, because I don't know what the largest data size could be.
There are different ways to do that. Unfortunately, none of them is straightforward or standard, except one: use a makefile that automatically rebuilds the launcher. IMHO, unless you have special requirements such as building the launcher on a system with no development environment, it is probably the simplest and most robust way.
As you explicitly ask for other solutions, I will give two:
on Windows, you could store the c# app as a resource. Once that's done, you can use a resource editor to change it on the fly or build a custom editor using the WinAPI functions BeginUpdateResource, UpdateResource and EndUpdateResource. You later load the resource with LoadResource in the launcher.
you could make the launcher program know its real size and just seek past that size, loading what follows as the C# app. To build it, you just need to concatenate the actual C++ executable and whatever you want it to process. The hard part here is that there is no portable way to identify the size or the end of an executable at compile time. You could try a two-pass build:
first pass, you set the size to an arbitrary value. You build and look at the real size
second pass, you set the size to what has been observed in first pass. As you only modify a size_t value, the size of the executable should not change. But I strongly urge you to control that size twice. Repeat if it is not the same (it could happen if the compiler was too clever and merged identical constants).
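The seek-past-the-known-size idea can be sketched with ordinary files standing in for the executable (a hedged sketch: `pack`, `extract`, and all file names are invented for illustration; a real launcher would open its own image, e.g. via argv[0], rather than a separate packed file):

```cpp
#include <fstream>
#include <iterator>
#include <string>

// pack(): what the build step does -- copy the launcher image and
// append the payload right after it.
void pack(const std::string& launcher, const std::string& payload,
          const std::string& out) {
    std::ifstream exe(launcher, std::ios::binary);
    std::ifstream data(payload, std::ios::binary);
    std::ofstream packed(out, std::ios::binary);
    packed << exe.rdbuf() << data.rdbuf();  // launcher first, payload after
}

// extract(): what the launcher does at run time -- skip its own known
// size and read everything that follows as the embedded application.
std::string extract(const std::string& packed, std::streamoff launcher_size) {
    std::ifstream in(packed, std::ios::binary);
    in.seekg(launcher_size);  // jump past the launcher image
    return std::string(std::istreambuf_iterator<char>(in), {});
}
```

The two-pass trick described above exists precisely to turn `launcher_size` into a constant the launcher can know about itself at compile time.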
But as I already said, my choice would be to use a makefile to automatically regenerate the launcher each time the C# app is rebuilt.

OpenSSL PKCS#11 signing multi process

I'm debugging a segmentation fault in a php module written by someone for an application (so changing the workflow and other time consuming operations are out of the question).
I have the following code:
...
...some code...
int marker=0;
ENGINE_load_dynamic();
ENGINE *e=ENGINE_by_id("dynamic");
if (e==NULL) return NULL;
...some more code to set some parameters using ENGINE_ctrl_cmd_string(...)
marker++; // gets about 10 or something
e=ENGINE_by_id("pkcs11");
if (e==NULL) return NULL;
Here comes the fun part - SIGSEGV:
marker++; //11
if (!ENGINE_init(e)){
std::cout<<"..error..";
ENGINE_finish(e);
ENGINE_free(e);
ENGINE_cleanup();
return NULL;
}
...code using pkcs#11 token that does work....
The problem appears in a random manner, sort of. The snippet is part of a php module. The script is called from a PostgreSQL script which in turn is called by another php application residing on another server (don't blame me for this design, I'm here to debug). The SIGSEGV appears when I refresh the main php application page quickly, which I assume it calls the above scripts multiple times concurrently, therefore attempting to use the token from separate processes at the same time.
Are my assumptions correct? can calls to ENGINE_init/finish/free from separate processes using the same token collide and cause a segmentation fault?
The segmentation fault is captured using my own handler, which picks up the marker value and prints it before exiting; it's the simplest method I could come up with for SIGSEGV debugging. If this method might yield wrong results, I'd appreciate a heads-up.
Any thoughts?
There's a README.ENGINE that provides a discussion of engines. I'm not sure how useful it will be since it makes some tall claims. For example, "... the source code is reasonably well self-documenting, but some summaries and usage instructions are needed".
But here's something on the dynamic:
The new "dynamic" ENGINE provides a low-overhead way to support
ENGINE implementations that aren't pre-compiled and linked into
OpenSSL-based applications. This could be because existing
compiled-in implementations have known problems and you wish to use
a newer version with an existing application. It could equally be
because the application (or OpenSSL library) you are using simply
doesn't have support for the ENGINE you wish to use, and the ENGINE
provider (eg. hardware vendor) is providing you with a
self-contained implementation in the form of a shared-library. The
other use-case for "dynamic" is with applications that wish to
maintain the smallest foot-print possible and so do not link in
various ENGINE implementations from OpenSSL, but instead leaves you
to provide them, if you want them, in the form of "dynamic"-loadable
shared-libraries. It should be possible for hardware vendors to
provide their own shared-libraries to support arbitrary hardware to
work with applications based on OpenSSL 0.9.7 or later. If you're
using an application based on 0.9.7 (or later) and the support you
desire is only announced for versions later than the one you need,
ask the vendor to backport their ENGINE to the version you need.
How does "dynamic" work?
The dynamic ENGINE has a special flag in its implementation such that
every time application code asks for the 'dynamic' ENGINE, it in fact
gets its own copy of it. As such, multi-threaded code (or code that
multiplexes multiple uses of 'dynamic' in a single application in any
way at all) does not get confused by 'dynamic' being used to do many
independent things. Other ENGINEs typically don't do this so there is
only ever 1 ENGINE structure of its type (and reference counts are used
to keep order). The dynamic ENGINE itself provides absolutely no
cryptographic functionality, and any attempt to "initialise" the ENGINE
automatically fails. All it does provide are a few "control commands"
that can be used to control how it will load an external ENGINE
implementation from a shared-library. To see these control commands,
use the command-line;
openssl engine -vvvv dynamic
The "SO_PATH" control command should be used to identify the
shared-library that contains the ENGINE implementation, and "NO_VCHECK"
might possibly be useful if there is a minor version conflict and you
(or a vendor helpdesk) is convinced you can safely ignore it.
"ID" is probably only needed if a shared-library implements
multiple ENGINEs, but if you know the engine id you expect to be using,
it doesn't hurt to specify it (and this provides a sanity check if
nothing else). "LIST_ADD" is only required if you actually wish the
loaded ENGINE to be discoverable by application code later on using the
ENGINE's "id". For most applications, this isn't necessary - but some
application authors may have nifty reasons for using it. The "LOAD"
command is the only one that takes no parameters and is the command
that uses the settings from any previous commands to actually *load*
the shared-library ENGINE implementation. If this command succeeds, the
(copy of the) 'dynamic' ENGINE will magically morph into the ENGINE
that has been loaded from the shared-library. As such, any control
commands supported by the loaded ENGINE could then be executed as per
normal. Eg. if ENGINE "foo" is implemented in the shared-library
"libfoo.so" and it supports some special control command "CMD_FOO", the
following code would load and use it (NB: obviously this code has no
error checking);
ENGINE *e = ENGINE_by_id("dynamic");
ENGINE_ctrl_cmd_string(e, "SO_PATH", "/lib/libfoo.so", 0);
ENGINE_ctrl_cmd_string(e, "ID", "foo", 0);
ENGINE_ctrl_cmd_string(e, "LOAD", NULL, 0);
ENGINE_ctrl_cmd_string(e, "CMD_FOO", "some input data", 0);
For testing, the "openssl engine" utility can be useful for this sort
of thing. For example the above code excerpt would achieve much the
same result as;
openssl engine dynamic \
-pre SO_PATH:/lib/libfoo.so \
-pre ID:foo \
-pre LOAD \
-pre "CMD_FOO:some input data"
Or to simply see the list of commands supported by the "foo" ENGINE;
openssl engine -vvvv dynamic \
-pre SO_PATH:/lib/libfoo.so \
-pre ID:foo \
-pre LOAD
Applications that support the ENGINE API and more specifically, the
"control commands" mechanism, will provide some way for you to pass
such commands through to ENGINEs. As such, you would select "dynamic"
as the ENGINE to use, and the parameters/commands you pass would
control the *actual* ENGINE used. Each command is actually a name-value
pair and the value can sometimes be omitted (eg. the "LOAD" command).
Whilst the syntax demonstrated in "openssl engine" uses a colon to
separate the command name from the value, applications may provide
their own syntax for making that separation (eg. a win32 registry
key-value pair may be used by some applications). The reason for the
"-pre" syntax in the "openssl engine" utility is that some commands
might be issued to an ENGINE *after* it has been initialised for use.
Eg. if an ENGINE implementation requires a smart-card to be inserted
during initialisation (or a PIN to be typed, or whatever), there may be
a control command you can issue afterwards to "forget" the smart-card
so that additional initialisation is no longer possible. In
applications such as web-servers, where potentially volatile code may
run on the same host system, this may provide some arguable security
value. In such a case, the command would be passed to the ENGINE after
it has been initialised for use, and so the "-post" switch would be
used instead. Applications may provide a different syntax for
supporting this distinction, and some may simply not provide it at all
("-pre" is almost always what you're after, in reality).

What is the build order for the kernel in Windows CE 6.0?

I've changed the file "handle.c" in winceos\COREOS\nk\kernel and need to build so that the changes get into core.dll for nk.bin.
Is there a build order to follow that avoids rebuilding the whole solution?
First, let me say that making that change where you did is a bad, bad idea. Never change the public or private trees directly. If Microsoft issues a QFE that changes that code, when you apply the QFE, your changes will be overwritten and without warning. Always clone the code and change the clone.
As far as making kernel changes without having to rebuild the entire project, the answer is no, you can't. Changes in the code potentially change addresses, and a vast amount of the OS is fixed up with those addresses during the build process. You'll have to rebuild the entire thing after a change like that (as opposed to, for example, drivers which you can build individually without rebuilding the entire OS).
Thanks for your answer.
What I found by trying it myself is that yes, it is possible, by doing "build & sysgen" on the "winceos" folder under PRIVATE.
The change I made to the kernel code was just adding a RETAILMSG to see the HANDLE count.
The file handle.c creates the handle table and hands out handles. There are a number of commands that create/allocate handles. I don't really know which handle requests make the kernel call into handle.c(??), but being able to manipulate it "can" be useful for some developers(??).
In summary, doing "build & sysgen" plus "MakeRunTimeImage" makes the kernel changes take effect.
I did it on PRIVATE/winceos, but perhaps it is also possible to do it more locally, for example on the PRIVATE/winceos/COREOS/nk/kernel folder. I haven't tried it ;)

How to use gprof to profile a daemon process without terminating it gracefully?

I need to profile a daemon written in C++, and gprof says the process needs to terminate for it to produce gmon.out. I'm wondering if anyone has ideas for getting gmon.out on Ctrl-C? I want to find the hot spots for CPU cycles.
I need to profile a daemon written in C++, and gprof says the process needs to terminate for it to produce gmon.out.
That fits the normal practice for debugging daemon processes: provide a switch (e.g. a command-line option) that forces the daemon to run in the foreground.
I'm wondering if anyone has ideas for getting gmon.out on Ctrl-C?
I'm not aware of such an option.
Though in the case of gmon, a call to exit() should suffice: if, for example, you intend to test the processing of, say, 100K messages, you can add a counter in the code that is incremented on every processed message. When the counter exceeds the limit, simply call exit().
You can also try adding a handler for some unused signal (like SIGUSR1 or SIGUSR2) and calling exit() from there. Though I have no personal experience with this and cannot be sure that gmon would work properly in that case.
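A minimal sketch of that signal-handler idea (assuming the daemon is built with -pg; SIGUSR1 and the function names are just illustrative, and note that exit() is not async-signal-safe, so this is strictly a debugging aid):

```cpp
#include <csignal>
#include <cstdlib>

// Handler for an otherwise-unused signal: exit() (unlike _exit() or the
// default signal action) runs atexit handlers, which is where gprof's
// -pg instrumentation writes gmon.out.
extern "C" void flush_profile_and_exit(int) {
    std::exit(0);
}

// Call this once during daemon startup; afterwards, `kill -USR1 <pid>`
// from a shell ends the process cleanly and gmon.out appears in its
// working directory.
void install_profile_dump_signal() {
    std::signal(SIGUSR1, flush_profile_and_exit);
}
```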
I want to find the hot spots for CPU cycles
My usual practice is to create a test application using the same source code as the daemon but a different main(), where I simulate the precise scenario I need to debug or test (often with a command-line switch selecting among many scenarios). For this purpose, I normally create a static library containing the whole module, except the file with main(), and link the test application against that static library. (That helps keep the Makefiles tidy.)
I prefer a separate test application to hacks inside the code because, especially in performance testing, I can sometimes bypass or reduce calls to expensive I/O (or DB accesses), which often skew the profiler's sampling and render the output useless.
As a first suggestion, I would say you might try another tool. If the performance of the daemon is not an issue in your test, you could give valgrind a try. It is a wonderful tool; I really love it.
If you want to make the daemon go as fast as possible, you can use lsstack with this technique. It will show you what's taking time that you can remove. If you're looking for hot spots, you are probably looking for the wrong thing. Typically there are function calls that are not absolutely needed, and those don't show up as hot spots, but they do show up on stackshots.
Another good option is RotateRight/Zoom.

Error handling / error logging in C++ for library/app combo

I've encountered the following problem pattern frequently over the years:
I'm writing complex code for a package comprised of a standalone application and also a library version of the core that people can use from inside other apps.
Both our own app and presumably ones that users create with the core library are likely to be run both in batch mode (off-line, scripted, remote, and/or from command line), as well as interactively.
The library/app takes complex and large runtime input and there may be a variety of error-like outputs including severe error messages, input syntax warnings, status messages, and run statistics. Note that these are all incidental outputs, not the primary purpose of the application which would be displayed or saved elsewhere and using different methods.
Some of these (probably only the very severe ones) might require a dialog box if run interactively; but it needs to log without stalling for user input if run in batch mode; and if run as a library the client program obviously wants to intercept and/or examine the errors as they occur.
It all needs to be cross-platform: Linux, Windows, OSX. And we want the solution to not be weird on any platform. For example, output to stderr is fine for Linux, but won't work on Windows when linked to a GUI app.
Client programs of the library may create multiple instances of the main class, and it would be nice if the client app could distinguish a separate error stream with each instance.
Let's assume everybody agrees it's good enough for the library methods to log errors via a simple call (error code and/or severity, then printf-like arguments giving an error message). The contentious part is how this is recorded or retrieved by the client app.
I've done this many times over the years, and am never fully satisfied with the solution. Furthermore, it's the kind of subproblem that's actually not very important to users (they want to see the error log if something goes wrong, but they don't really care about our technique for implementing it), but the topic gets the programmers fired up and they invariably waste inordinate time on this detail and are never quite happy.
Anybody have any wisdom for how to integrate this functionality into a C++ API, or is there an accepted paradigm or a good open source solution (not GPL, please, I'd like a solution I can use in commercial closed apps as well as OSS projects)?
We use Apache's Log4cxx for logging which isn't perfect, but provides a lot of infrastructure and a consistent approach across projects. I believe it is cross-platform, though we only use it on Windows.
It provides for run time configuration via an ini file which allows you to control how the log file is output, and you could write your own appenders if you want specific behaviour (e.g. an error dialog under the UI).
If clients of your library also adopt it then it would integrate their logging output into the same log file(s).
Differentiation between instances of the main class could be supported using the nested diagnostic context (NDC) feature.
Log4cxx should work for you. You need to implement a provider that allows the library user to catch the log output in callbacks. The library would export a function to install the callbacks. That function should, behind the scenes, reconfigure log4cxx to get rid of all appenders and set up the "custom" appender.
Of course, the library user has an option to not install the callbacks and use log4cxx as is.
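Independent of log4cxx, the callback shape being described might look something like this (a hedged sketch: `ErrorSink`, `Severity`, and every other name below are invented for illustration, not part of any real library):

```cpp
#include <cstdarg>
#include <cstdio>
#include <functional>
#include <string>

// Severity levels matching the question's mix of outputs.
enum class Severity { Status, Warning, Error, Fatal };

// One ErrorSink per instance of the library's main class gives the
// per-instance error-stream separation the question asks for.
class ErrorSink {
public:
    using Handler = std::function<void(Severity, const std::string&)>;

    // Default: batch-friendly behavior -- write to stderr, never block
    // waiting for user input.
    ErrorSink() : handler_([](Severity, const std::string& msg) {
        std::fprintf(stderr, "%s\n", msg.c_str());
    }) {}

    // A GUI app or library client swaps in its own handler, e.g. one
    // that pops a dialog for Fatal and logs everything else.
    void set_handler(Handler h) { handler_ = std::move(h); }

    // The "simple call" from the question: severity plus printf-like args.
    void report(Severity sev, const char* fmt, ...) {
        char buf[1024];
        va_list ap;
        va_start(ap, fmt);
        std::vsnprintf(buf, sizeof buf, fmt, ap);
        va_end(ap);
        handler_(sev, buf);
    }

private:
    Handler handler_;
};
```

A real log4cxx-based design would put this dispatch inside a custom appender instead, but the client-facing contract is the same: the library formats, the embedder decides what happens to the message.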