One source needs to compile differently on multiple machines - c++

I have a program in C++ designed to run a simulation for a summer project I'm doing. It is pretty computationally intensive, so I have gotten permission to use a cluster computer's resources to run it, but I test and develop it on my own laptop. The program generates text files as output, and this is where I run into trouble.
I need the text files to be saved in different paths depending on whether I'm running the program on my own computer or on the cluster. My solution so far has been to use $(shell hostname) in my makefile to check which machine the code is being compiled on and, from that output, define macros for conditional compilation. At one time I was using two different versions of a header that defined the macros differently on my computer versus the cluster, but I'm using a git repository to transfer changes back and forth, and I was having a very difficult time excluding one file like that.
I was just wondering what the preferred practice is for setting paths at compile time on different computers with the same source.

It doesn't sound to me like it needs to compile differently on different machines. It sounds like it needs to take some paths at run-time from either the command line, or from some sort of config file.
One suggestion would be to use the Boost Program Options library, which with one simple setup allows you to read the same parameters either from the command line or from a config file. This is what I used when running similar jobs on a big cluster or on my laptop, and it worked nicely.
Below is a simple, compilable example based on the one in their docs:
#include <boost/program_options.hpp>
#include <iostream>

namespace po = boost::program_options;
using std::cout;

int main(int ac, char * av[])
{
    // Declare the supported options.
    po::options_description desc("Allowed options");
    desc.add_options()
        ("help", "produce help message")
        ("compression", po::value<int>(), "set compression level")
    ;

    // Parse the command line into the variables map.
    po::variables_map vm;
    po::store(po::parse_command_line(ac, av, desc), vm);
    po::notify(vm);

    if (vm.count("help")) {
        cout << desc << "\n";
        return 1;
    }

    if (vm.count("compression")) {
        cout << "Compression level was set to "
             << vm["compression"].as<int>() << ".\n";
    } else {
        cout << "Compression level was not set.\n";
    }
    return 0;
}
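The same desc and vm can also be fed from a config file, so the cluster job can keep its settings in a file while local runs use the command line. A minimal sketch, continuing inside the same main() (the file name sim.cfg is hypothetical; the file would contain lines like compression = 9):
    std::ifstream cfg("sim.cfg");  // needs #include <fstream>
    po::store(po::parse_config_file(cfg, desc), vm);
    po::notify(vm);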

I agree with Alex; the easiest solution will not be at compile time, but at runtime, either via a config file or command-line arguments. All other things being equal, it may be easiest for you to just pass the path via command-line arguments using argv and argc.
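A minimal sketch of that approach (the default path ./results and the binary name below are assumptions for illustration):
#include <iostream>
#include <string>

int main(int argc, char * argv[])
{
    // The first argument, if present, is the output directory.
    std::string outputDir = (argc > 1) ? argv[1] : "./results";
    std::cout << "Writing output under " << outputDir << "\n";
    // ... open all output files relative to outputDir ...
    return 0;
}
On the laptop you might run ./sim, and on the cluster ./sim /scratch/you/output: same binary, no recompilation.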

I am not very experienced with this, but I can think of one simple way to do it.
Set up an environment variable that points to the appropriate directory on each machine,
and use that environment variable in your makefile.
For example,
in machine 1's ~/.bashrc
export MY_DIRECTORY=~/Foo
in machine 2's ~/.bashrc
export MY_DIRECTORY=~/Bar
Your Makefile will then use the environment variable of whichever machine it is running on, e.g. $(MY_DIRECTORY).
(And since ~/.bashrc is not part of your repository, different copies can exist on the two machines.)
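One way to get the variable into the code at compile time is to pass it as a macro. The exact flag is an assumption about how your Makefile might be set up, e.g. CXXFLAGS += -DMY_DIRECTORY=\"$(MY_DIRECTORY)\":
#include <fstream>

// Fallback in case the Makefile did not define the macro.
#ifndef MY_DIRECTORY
#define MY_DIRECTORY "."
#endif

int main()
{
    // Adjacent string literals are concatenated at compile time.
    std::ofstream out(MY_DIRECTORY "/output.txt");
    out << "simulation results...\n";
    return 0;
}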

If you can stomach a dependency on QtCore (a 750 KB to 4 MB library, depending on compile options and platform), you can use QSettings to store the directory path conveniently, without having to set it each time. You can pass the path on the command line once and have the program store it in the settings file; that setting then becomes the default for future invocations without the command-line argument.
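A minimal sketch of that idea; the organization/application names and the outputDir key are hypothetical:
#include <QCoreApplication>
#include <QSettings>
#include <QString>
#include <QTextStream>

int main(int argc, char * argv[])
{
    QCoreApplication app(argc, argv);
    // These names determine where QSettings stores its file.
    QCoreApplication::setOrganizationName("MyLab");
    QCoreApplication::setApplicationName("simulation");

    QSettings settings;
    if (argc > 1) // remember a newly supplied path for future runs
        settings.setValue("outputDir", QString::fromLocal8Bit(argv[1]));

    const QString outputDir = settings.value("outputDir", ".").toString();
    QTextStream(stdout) << "Writing output under " << outputDir << "\n";
    // ... write result files under outputDir ...
    return 0;
}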
Other dependency-free alternatives would involve writing your own configuration file parsing routines or using existing ones, but I always like to rely on well-tested and open source code.
Good luck!

You could continue with the route of using separate headers. Include both in your Git repository as clusterHeader.h and laptopHeader.h, and use your existing Makefile logic to build with a different header on each system. To ease linking troubles, perhaps have the build script temporarily copy or rename the appropriate file to plain old header.h while building, and change the name back at the end of the script.
If you want to keep the shared parts of the header consistent between builds, put them in a machine-independent header and #include the machine-specific header from it,
i.e.
source.cpp
|
-> header.h
   |
   -> clusterHeader.h OR laptopHeader.h
Alternatively, as long as the systems aren't exactly the same OS (which I'm assuming they aren't since one is a cluster), you could probably quite easily get it to work with some simple #ifdef statements.
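For example, a minimal sketch keyed off a macro that your Makefile would define only on the cluster; the -DON_CLUSTER flag and both paths are assumptions:
// Selected at compile time, e.g. by building with -DON_CLUSTER on the cluster.
#ifdef ON_CLUSTER
static const char * const kOutputDir = "/scratch/myuser/simulation";
#else
static const char * const kOutputDir = "/home/myuser/simulation";
#endif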
Finally, CMake or qmake are always options.

Related

non-hermetic Bazel action to enable remote caching

I've been iterating on a Bazel rule for a tool that depends on a "custom" toolchain (Verilator, if you're familiar). The tool is supposed to read arguments and inputs and generate C++ files. The action that invokes Verilator is defined below:
ctx.actions.run(
    arguments = [args],
    executable = verilator_toolchain.verilator_bin,
    inputs = inputs,
    outputs = [verilator_output],
    progress_message = "[Verilator] Compiling {}".format(ctx.label),
)
The problem is that the executable given to this action is not exactly the same across platforms: it is slightly larger on one, and the mac and linux executables have different hashes.
I can trust that the output can be the same, and I'd like to share a remote cache for this action for both platforms; is there a "best practice" where I can rewrite this action to be non-hermetic so the toolchain binary isn't considered as an "input" to the cache? I think the cpp rules do something similar to this.
No; outside of writing an incorrect, non-hermetic rule, there is no way to prevent Bazel from putting all action inputs into the hash key.

Fastest way to make console output "verbose" or not

I am making a small system and I want to be able to toggle "verbose" text output in the whole system.
I have made a file called globals.h:
namespace REBr{
extern bool console_verbose = false;
}
If this is true I want all my classes to print a message to the console when they are constructing, destructing, copying or doing pretty much anything.
For example:
window(string title="", int width=1280, int height=720)
    : Width(width), Height(height), title(title)
{
    if (console_verbose) {
        std::cout << "Generating window #" << this->instanceCounter;
        std::cout << "-";
    }
    this->window = SDL_CreateWindow(title.c_str(), 0, 0, width, height, SDL_WINDOW_OPENGL);
    if (console_verbose)
        std::cout << "-";
    if (this->window)
    {
        this->glcontext = SDL_GL_CreateContext(window);
        if (console_verbose)
            std::cout << ".";
        if (this->glcontext == NULL)
        {
            std::cout << "FATAL ERROR IN REBr::WINDOW::CONSTR_OPENGLCONTEXT: " << SDL_GetError() << std::endl;
        }
    }
    else std::cout << "FATAL ERROR IN REBr::WINDOW::CONSTR_WINDOW: " << SDL_GetError() << std::endl;
    if (console_verbose)
        std::cout << ">done!" << std::endl;
}
Now as you can see, I have a lot of ifs in that constructor, and I REALLY don't want that, since it will slow down my application. I need this to be as fast as possible without removing the "loading bar" (which helps me determine at which function the program stopped working).
What is the best/fastest way to accomplish this?
Everything in my system is under the namespace REBr.
Some options to achieve that:
Use a logger library. This is the best option, as it gives you maximum flexibility and some useful experience ;) and you don't have to devise anything yourself. For example, look at Google glog.
Define a macro that lets you turn all these logs on or off by changing only the macro (see the sketch after this list). But it isn't so easy to write such a macro correctly.
Mark your conditional flag as constexpr. That way you may switch the flag and, depending on its value, the compiler will optimise the ifs out of the compiled program. But the ifs will still be in the code, so it looks kinda bulky.
Anyway, all these options require recompilation. Without recompilation, it is impossible to achieve maximum speed.
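A minimal sketch of the macro approach; the VERBOSE_CONSOLE name is an assumption, to be defined in debug builds, e.g. with -DVERBOSE_CONSOLE:
#include <iostream>

#ifdef VERBOSE_CONSOLE
// Verbose build: stream the message to std::cout.
#define VPRINT(msg) do { std::cout << msg; } while (0)
#else
// Quiet build: the statement compiles away to nothing.
#define VPRINT(msg) do { } while (0)
#endif

// Usage inside the constructor:
//     VPRINT("Generating window #" << this->instanceCounter);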
I often use a Logger class that supports debug levels. A call might look like:
logger->Log(debugLevel, "%s %s %d %d", timestamp, msg, value1, value2);
The Logger class supports multiple debug levels so that I can fine tune the debug output. This can be set at any time through the command line or with a debugger. The Log statement uses a variable length argument list much like printf.
Google's logging module is widely used in the industry and supports logging levels that you can set from the command line. For example (taken from their documentation)
VLOG(1) << "I'm printed when you run the program with --v=1 or higher";
VLOG(2) << "I'm printed when you run the program with --v=2 or higher";
You can find the code here https://github.com/google/glog and the documentation in the doc/ folder.

How I can monitor the output files and move/rename in desired directory

A program generates a text file every 15 iterations, and it overwrites output.txt (formed at the 15th step) with a new output.txt (formed at the 30th step) because it uses the same name. I can't modify the file name within the program. Can I run a script concurrently with the program on my Ubuntu system that monitors my directory and, when output.txt is formed, moves it to a desired directory or renames it?
I can't modify the file name within the program.
(I take this to mean you are required to not change the file name, not that you don't know how.)
You've marked this posting as C++.
While it is possible to run some script to monitor a directory, coordinating the name change and running a thread or another process (from C++) can be much more challenging than other choices.
How about a simpler approach:
I suggest using std::stringstream to generate a unique pfn (path-file-name) for each time you want to write a file. For instance, an incrementing number can be appended to the unmodifiable-file-name.
Something like:
std::string uniqueFileName(void)
{
    static int fileCount = 0;  // incremented on every call

    std::stringstream ss;
    //        vvvvvvvvvv -- unmodifiable-file-name is not changed
    ss << "output.txt" << ++fileCount;
    return ss.str();
}
Good luck.
PS
If you feel you must write the file under the correct name first, and then change the file name to something unique, then yes, you can rename the file from within this program (the synchronization is trivial, because your code knows when the write has finished).
I would use popen(), as I feel it provides more feedback, and I've used it before.
Others prefer something like system() (there are about six calls in that family).
In either case, you provide a shell command to rename the existing file, like mv fromPfn toPfn (or maybe you'll need cp).
Either way, your code must not proceed until the command has completed.
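A minimal sketch of the system() variant; it assumes the program has finished writing output.txt before this is called:
#include <cstdlib>
#include <string>

// Move output.txt to a unique destination; returns true on success.
// system() blocks until the shell command finishes, so the caller
// cannot proceed until the move has completed.
bool moveOutput(const std::string & toPfn)
{
    const std::string cmd = "mv output.txt " + toPfn;
    return std::system(cmd.c_str()) == 0;
}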

How to properly debug a binary generated by `go test -c` using GDB?

The go test command has support for the -c flag, described as follows:
-c Compile the test binary to pkg.test but do not run it.
(Where pkg is the last element of the package's import path.)
As far as I understand, generating a binary like this is the way to run it interactively using GDB. However, since the test binary is created by combining the source and test files temporarily in some /tmp/ directory, this is what happens when I run list in gdb:
Loading Go Runtime support.
(gdb) list
42 github.com/<username>/<project>/_test/_testmain.go: No such file or directory.
This means I cannot happily inspect the Go source code in GDB like I'm used to. I know it is possible to force the temporary directory to stay by passing the -work flag to the go test command, but it is still a huge hassle since the binary is not created in that directory. I was wondering if anyone has found a clean solution to this problem.
Go 1.5 has been released, and there is still no officially sanctioned Go debugger. I haven't had much success using GDB for effectively debugging Go programs or test binaries. However, I have had success using Delve, a non-official debugger that is still undergoing development: https://github.com/derekparker/delve
To run your test code in the debugger, simply install delve:
go get -u github.com/derekparker/delve/cmd/dlv
... and then start the tests in the debugger from within your workspace:
dlv test
From the debugger prompt, you can single-step, set breakpoints, etc.
Give it a whirl!
Unfortunately, this appears to be a known issue that's not going to be fixed. See this discussion:
https://groups.google.com/forum/#!topic/golang-nuts/nIA09gp3eNU
I've seen two solutions to this problem.
1) create a .gdbinit file with a set substitute-path command to
redirect gdb to the actual location of the source. This file could be
generated by the go tool but you'd risk overwriting someone's custom
.gdbinit file and would tie the go tool to gdb which seems like a bad
idea.
2) Replace the source file paths in the executable (which are pointing
to /tmp/...) with the location they reside on disk. This is
straightforward if the real path is shorter than the /tmp/... path. This is
This would likely require additional support from the compiler /
linker to make this solution more generic.
It spawned this issue on the Go issue tracker, where the decision ended up being:
https://code.google.com/p/go/issues/detail?id=2881
This is annoying, but it is the least of many annoying possibilities.
As a rule, the go tool should not be scribbling in the source
directories, which might not even be writable, and it shouldn't be
leaving files elsewhere after it exits. There is next to nothing
interesting in _testmain.go. People testing with gdb can break on
testing.Main instead.
Russ
Status: Unfortunate
So, in short, it sucks, and while you can work around it and GDB a test executable, the development team is unlikely to make it as easy as it could be for you.
I'm still new to the golang game, but for what it's worth, basic debugging seems to work.
The list command you're trying to use works as long as you're already at a breakpoint somewhere in your code. For example:
(gdb) b aws.go:54
Breakpoint 1 at 0x61841: file /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.go, line 54.
(gdb) r
Starting program: /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.test
[snip: some warnings about BinaryCache]
Breakpoint 1, github.com/stellar/deliverator/aws.imageIsNewer (latest=0xc2081fe2d0, ami=0xc2081fe3c0, ~r2=false)
at /Users/mat/gocode/src/github.com/stellar/deliverator/aws/aws.go:54
54 layout := "2006-01-02T15:04:05.000Z"
(gdb) list
49 func imageIsNewer(latest *ec2.Image, ami *ec2.Image) bool {
50 if latest == nil {
51 return true
52 }
53
54 layout := "2006-01-02T15:04:05.000Z"
55
56 amiCreationTime, amiErr := time.Parse(layout, *ami.CreationDate)
57 if amiErr != nil {
58 panic(amiErr)
This is just after running the following in the aws subdir of my project:
go test -c
gdb aws.test
As an additional caveat, it does seem very selective about where breakpoints can be placed. It seems like the target has to be a line containing an expression, but that conclusion comes only from experimentation.
If you're willing to use tools besides GDB, check out godebug. To use it, first install with:
go get github.com/mailgun/godebug
Next, insert a breakpoint somewhere by adding the following statement to your code:
_ = "breakpoint"
Now run your tests with the godebug test command.
godebug test
It supports many of the parameters from the go test command.
-test.bench string
regular expression per path component to select benchmarks to run
-test.benchmem
print memory allocations for benchmarks
-test.benchtime duration
approximate run time for each benchmark (default 1s)
-test.blockprofile string
write a goroutine blocking profile to the named file after execution
-test.blockprofilerate int
if >= 0, calls runtime.SetBlockProfileRate() (default 1)
-test.count n
run tests and benchmarks n times (default 1)
-test.coverprofile string
write a coverage profile to the named file after execution
-test.cpu string
comma-separated list of number of CPUs to use for each test
-test.cpuprofile string
write a cpu profile to the named file during execution
-test.memprofile string
write a memory profile to the named file after execution
-test.memprofilerate int
if >=0, sets runtime.MemProfileRate
-test.outputdir string
directory in which to write profiles
-test.parallel int
maximum test parallelism (default 4)
-test.run string
regular expression to select tests and examples to run
-test.short
run smaller test suite to save time
-test.timeout duration
if positive, sets an aggregate time limit for all tests
-test.trace string
write an execution trace to the named file after execution
-test.v
verbose: print additional output

What is the point of clog?

I've been wondering, what is the point of clog? As near as I can tell, clog is the same as cerr but with buffering so it is more efficient. Usually stderr is the same as stdout, so clog is the same as cout. This seems pretty lame to me, so I figure I must be misunderstanding it. If I have log messages going out to the same place I have error messages going out to (perhaps something in /var/log/messages), then I probably am not writing too much out (so there isn't much lost by using non-buffered cerr). In my experience, I want my log messages up to date (not buffered) so I can help find a crash (so I don't want to be using the buffered clog). Apparently I should always be using cerr.
I'd like to be able to redirect clog inside my program. It would be useful to redirect cerr so that when I call a library routine I can control where cerr and clog go. Do some compilers support this? I just checked DJGPP and stdout is defined as the address of a FILE struct, so it is illegal to do something like "stdout = freopen(...)".
Is it possible to redirect clog, cerr, cout, stdin, stdout, and/or stderr?
Is the only difference between clog and cerr the buffering?
How should I implement (or find) a more robust logging facility (links please)?
Is it possible to redirect clog, cerr, cout, stdin, stdout, and/or stderr?
Yes. You want the rdbuf function.
ofstream ofs("logfile");
streambuf * oldbuf = cout.rdbuf(ofs.rdbuf()); // rdbuf() returns the previous buffer
cout << "Goes to file." << endl;
// ... and restore it before ofs goes out of scope:
cout.rdbuf(oldbuf);
Is the only difference between clog and cerr the buffering?
As far as I know, yes.
If you're in a posix shell environment (I'm really thinking of bash), you can redirect any
file descriptor to any other file descriptor, so to redirect, you can just:
$ myprogram 2>&5
to redirect stderr to the file represented by fd=5.
Edit: on second thought, I like Konrad Rudolph's answer about redirection better. rdbuf() is a more coherent and portable way to do it.
As for logging, well...I start with the Boost library for all things C++ that isn't in the std library. Behold: Boost Logging v2
Edit: Boost Logging is not part of the Boost Libraries; it has been reviewed, but not accepted.
Edit: 2 years later, back in May 2010, Boost did accept a logging library, now called Boost.Log.
Of course, there are alternatives:
Log4Cpp (a log4j-style API for C++)
Log4Cxx (Apache-sponsored log4j-style API)
Pantheios (defunct? last time I tried I couldn't get it to build on a recent compiler)
Google's glog (hat-tip SuperElectric)
There's also the Windows Event logger.
And a couple of articles that may be of use:
Logging in C++ (Dr. Dobbs)
Logging and Tracing Simplified (Sun)
Since there are several answers here about redirection, I will add this nice gem I stumbled across recently about redirection:
#include <fstream>
#include <iostream>

class redirecter
{
public:
    redirecter(std::ostream & dst, std::ostream & src)
        : src(src), sbuf(src.rdbuf(dst.rdbuf())) {}
    ~redirecter() { src.rdbuf(sbuf); }
private:
    std::ostream & src;
    std::streambuf * const sbuf;
};

void hello_world()
{
    std::cout << "Hello, world!\n";
}

int main()
{
    std::ofstream log("hello-world.log");
    redirecter redirect(log, std::cout);
    hello_world();
    return 0;
}
It's basically a redirection class that allows you to redirect any two streams, and restore it when you're finished.
Redirections
Konrad Rudolph's answer is good in regard to how to redirect std::clog (std::wclog).
Other answers tell you about various possibilities, such as using a command-line redirect like 2>output.log. Under Unix you can also create another output for your command with something like 3>output.log; in your program you then have to use file descriptor number 3 to print those logs. You can continue to print to stdout and stderr normally. The Visual Studio IDE has a similar feature with its CDebug command, which sends its output to the IDE output window.
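A minimal, POSIX-only sketch of the fd 3 idea; it assumes the shell really opened descriptor 3, e.g. via ./myprogram 3>output.log:
#include <string>
#include <unistd.h>  // write()

// Write one log line to file descriptor 3, opened by the shell (3>output.log).
void log_to_fd3(const std::string & msg)
{
    (void)::write(3, msg.data(), msg.size());
}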
stderr is the same as stdout?
This is generally true, but under Unix you can set up stderr to go to /dev/console, which means that it goes to another tty (a.k.a. terminal). It's rarely used these days. I had it that way on IRIX: I would open a separate X window and see the errors in it.
Also many people send error messages to /dev/null. On the command line you write:
command ...args... 2>/dev/null
syslog
One thing not mentioned, under Unix, you also have syslog().
The newest versions under Linux (and probably Mac OS X) do a lot more than they used to. In particular, syslog can use the identity and some other parameters to redirect logs to a specific file (e.g. mail.log). The syslog mechanism can be used between computers, so logs from computer A can be sent to computer B. And of course you can filter logs in various ways, especially by severity.
syslog() is also very simple to use:
syslog(LOG_ERR, "message #%d", count++);
It offers 8 levels (severities), a printf()-style format, and a list of arguments for the format.
Programmatically, you may tweak a few things if you first call the openlog() function; you must call it before your first call to syslog().
As mentioned by unixman83, you may want to use a macro instead. That way you can include some parameters to your messages without having to repeat them over and over again. Maybe something like this (see Variadic Macro):
// (not tested... requires msg to be a string literal)
#define LOG(lvl, msg, ...) \
syslog(lvl, msg " (in " __FILE__ ":%d)", __VA_ARGS__, __LINE__)
You may also find __func__ useful.
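For instance, a hypothetical call
LOG(LOG_ERR, "cannot open %s", filename);
would log something like "cannot open data.txt (in main.cpp:42)", with the file name and line number taken from the call site.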
The redirection, filtering, etc. is done by creating configuration files. Here is an example from my snapwebsites project:
mail.err /var/log/mail/mail.err
mail.* /var/log/mail/mail.log
& stop
I install the file under /etc/rsyslog.d/ and run:
invoke-rc.d rsyslog restart
so the syslog server handles that change and saves any mail related logs to those folders.
Note: I also have to create the /var/log/mail folder and the files inside the folder to make sure it all works right (because otherwise the mail daemon may not have enough permissions.)
snaplogger (a little plug)
I've used log4cplus, which, since version 1.2.x, is quite good. I have three gripes about it, though:
it requires me to completely clear everything if I want to call fork(); somehow it does not survive a fork() call properly (at least in the version I had, it used a thread)
the configuration files (.properties) are not easy to manage in my environment, where I like the administrators to make changes without modifying the original
it uses C++03, and we are now in 2019... I'd like to have at least C++11
Because of that, and especially because of point (1), I wrote my own version called snaplogger. This is not exactly a standalone project, though. I use many other projects from the snapcpp environment (it's much easier to just get snapcpp and run the bin/build-snap script or just get the binaries from launchpad.)
The advantage of using a logger such as snaplogger or log4cplus is that you can generally define any number of destinations and many other parameters (such as the severity level, as offered by syslog()). log4cplus is capable of sending its output to many different places: files, syslog, the MS-Windows log system, console, a server, etc. Check out the appenders in those two projects to get an idea of the list of possibilities. The interesting factor here is that any log can be sent to all the destinations. It is useful to have a file named all.log where all your services send their logs: it lets you understand certain bugs that would not be as easy to spot with separate log files when running many services in parallel.
Here is a simple example in a snaplogger configuration file:
[all]
type=file
lock=true
filename=/var/log/snapwebsites/all.log
[file]
lock=false
filename=/var/log/snapwebsites/firewall.log
Notice that for the all.log file I require a lock, so that multiple writers do not mangle each other's log lines. The lock is not necessary for the [file] section because I only have one process (no threads) writing to that one.
Both offer you a way to add your own appenders. So for example if you have a Qt application with an output window, you could write an appender to send the output of the SNAP_LOG_ERROR() calls to that window.
snaplogger also offers you a way to extend the variable support in messages (also called the format.) For example, I can insert the date using the ${date} variable. Then I can tweak it with a parameter. To only output the year, I use ${date:year}. This variable parameter support is also extensible.
snaplogger can filter the output by severity (like syslog), by a regex, and by component. We have a normal and a secure component; the default is normal. I want logs sent to the secure component to be written to secure files, meaning in a sub-directory that is far better protected than the normal logs, which most admins can review. When I run my HTTP services, I sometimes send information such as the last 3 digits of a credit card; I prefer to have those in a secure log. It could also be password-related errors. Anything I deem a security risk in a log, really. Again, components are extensible, so you can have your own.
One little point about the redirecter class: it needs to be destroyed properly, and only once. The destructor will ensure this happens if the function it is declared in actually returns and the object itself is never copied.
To ensure it can't be copied, provide private copy and assignment operators:
class redirecter
{
public:
    redirecter(std::ostream & src, std::ostream & dst)
        : src_(src), sbuf_(src.rdbuf(dst.rdbuf())) {}
    ~redirecter() { src_.rdbuf(sbuf_); }
private:
    std::ostream & src_;
    std::streambuf * const sbuf_;
    // Prevent copying.
    redirecter( const redirecter& );
    redirecter& operator=( const redirecter& );
};
I'm using this technique by redirecting std::clog to a log file in my main(). To ensure that main() actually returns, I place the guts of main() in a try/catch block. Then elsewhere in my program, where I might call exit(), I throw an exception instead. This returns control to main() which can then execute a return statement.
Basic Logger
#define myerr(e) do { CriticalSectionLocker crit; std::cerr << e << std::endl; } while (0)
(CriticalSectionLocker is your own RAII lock; the do { ... } while (0) wrapper makes the macro safe inside an unbraced if/else.)
Used as myerr("ERR: " << message); or myerr("WARN: " << message << code << etc); it is very effective.
Then do:
./programname.exe 2> ./stderr.log
perl parsestderr.pl stderr.log
or just parse stderr.log by hand
I admit this is not for extremely performance-critical code. But then, who writes that anyway?