If I'm using ASDF to build my Common Lisp system and have multiple projects in my ASDF path, how can I avoid naming collisions for my system names?
For example, if project-a and project-b both define a system named utilities, but the two utilities systems are different, how can I make sure ASDF finds the correct one?
I know that I could make utilities a directory instead of a system and reference individual files, but I find it desirable to use ASDF so I don't have to worry about specifying a path. Also, I'm trying to avoid a C-style way of avoiding collisions, e.g., project-a-utilities.
Is there a typical convention here that I don't know about?
I've got a project with many files in it and I want it to work with most popular compilers.
Unfortunately, PolyML and SML/NJ require use statements, while MosML additionally requires explicitly loading basis library structures using load, which is not recognised by either poly or sml.
On top of that, MLton and MLKit require a completely different .mlb file simply listing filenames and also require an explicit import of basis library, which is done in a different way to MosML:
$(SML_LIB)/basis/basis.mlb
Is there some standard universal "include this file" command, and if it doesn't exist, is there some other way to have all compilers read from one entry-point file?
P.S. Wouldn't mind someone going on a small rant about compiler differences. I'm always interested in what people think and there's not too much info available :-)
The use function is the standard universal "include this file" command, included in the top-level environment:
val use : string -> unit  (* implementation dependent *)
I generally maintain the build environment in SML/NJ's CM, then convert to mlb with cm2mlb. It defines a flag MLton when parsing the sources.cm file, so you can use that to work around differences in module loading behavior:
#if(defined(MLton))
runmain.sml
#endif
There is also a set of sml-buildscripts which converts from mlb to Poly/ML; however, I am not familiar with them, nor with Poly/ML.
CM is convenient as the authoritative source, since it provides programmatic access from SML via the structure CM. This is what cm2mlb uses, so while I'm not aware of an existing tool that converts from CM to Poly/ML, it should be possible to write one.
I see these hex strings, which do not seem to belong to any DLL, in the call stack in Visual Studio:
000000001665b7e0()
0000000000000935()
0000094500000001()
000000001665b9a4()
Normally I would see something like:
libabc.dll!myclass::myfunction() Line 76
What do they imply, and how do I make sense of them?
Those are indeed functions, but no one has left "breadcrumbs" your debugger can use to translate those addresses into a function name.
In this case, the mapping between 000000001665b7e0 and a function name is either in a symbol file which you do not have, a symbol file your debugger is unaware of, a symbol file your debugger is unable to read, or such a mapping does not exist.
What can you do about it?
Either find the symbol information for this function and point your development system at it, or accept that you do not have this information.
The former is tricky because you have no clue what the function is. You may have to take a shotgun approach and add symbols for all the libraries, but you can narrow the search if you know which libraries your program uses.
The latter is a viable option, because if you don't have access to the debugging information, odds are pretty good you can't do anything about any bugs made by whoever wrote the code. Maybe you can write them a nasty e-mail. For an established library, it's more likely that there is no bug in the library and your program is using the library incorrectly. Check the library documentation and debug your code first. When you have eliminated the possibility of errors at your end, then start digging into the third-party code. With an established library, there is often a core of developers who will be able to help confirm and resolve a library bug.
Why this happens:
The computer doesn't care what people call things. The computer only cares where things are in memory, so to make the smallest output file possible, the development system's (AKA "IDE", "compiler", and "tool chain" with varying degrees of accuracy) build tools typically strip out all of the stuff that's unnecessary to run the program. The nice, human-readable names sane programmers give functions, variables, classes, and what have you are among the first things to go.
The development system usually will allow you to preserve this address-symbol mapping to make debugging easier. As you've seen, raw hex numbers aren't much use without some way to map them to recognizable terms you can use to look up documentation. Depending on the build system, this information may be left in the executable or library (resulting in a much larger output file), or it may come on the side as an optional symbol information file. These mapping files are often specific to the development system and are not readable by other development systems.
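To make that concrete with Microsoft's toolchain (a sketch only; the file names are made up to match the example above, and exact switches vary by version): compiling with /Zi and linking with /DEBUG emits a .pdb symbol file alongside the binary.
rem compile with debug information
cl /Zi /c myclass.cpp
rem emit libabc.dll together with its symbol file libabc.pdb
link /DLL /DEBUG /OUT:libabc.dll myclass.obj
Keep the .pdb next to the DLL, or add its folder to the debugger's symbol search path, and the frame shows up as libabc.dll!myclass::myfunction() again instead of a bare address.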
Most build systems, like autoconf/automake, allow the user to specify a target directory to install the various files needed to run a program. Usually this includes binaries, configuration files, auxiliary scripts, etc.
At the same time, many executables often need to read from a configuration file in order to allow a user to modify runtime settings.
Ultimately, a program (let's say, a compiled C or C++ program) needs to know where to look to read in a configuration file. A lot of times I will just hardcode the path as something like /etc/MYPROGRAM/myprog.conf, which of course is not a great idea.
But in the autoconf world, the user might specify an install prefix, meaning that the C/C++ code needs to somehow be aware of this.
One solution would be to write a C header file with a .in suffix that simply defines the location of the config file, like:
const char* config_file_path = "@CONFIG_FILE_PATH@"; // `CONFIG_FILE_PATH` is substituted at configure time, as set up in `configure.ac`.
This file would be named something like constants.h.in, and it would have to be processed by configure to output an actual header file, which could then be included by whatever .c or .cpp files need it.
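Concretely, I picture something like this (just a sketch; the variable name is mine):
/* constants.h.in -- configure turns this into constants.h */
#ifndef CONSTANTS_H
#define CONSTANTS_H
/* @CONFIG_FILE_PATH@ is replaced because configure.ac calls
   AC_SUBST([CONFIG_FILE_PATH]) and lists the header in
   AC_CONFIG_FILES([constants.h]). */
#define CONFIG_FILE_PATH "@CONFIG_FILE_PATH@"
#endif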
Is that the usual way this sort of thing is handled? It seems a bit cumbersome, so I wonder if there is a better solution.
There are basically two choices for how to handle this.
One choice is to do what you've mentioned -- compile the relevant paths into the resulting executable or library. Here it's worth noting that if files are installed in different sub-parts of the prefix, then each such thing needs its own compile-time path. That's because the user might specify --prefix separately from --bindir, separately from --libexecdir, etc. Another wrinkle here is that if there are multiple installed programs that refer to each other, then this process probably should take into account the program name transform (see docs on --program-transform-name and friends).
That's all if you want full generality of course.
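For the compile-time variant, the common automake idiom is to pass each directory as a preprocessor define, so that $(prefix) is fully expanded by make rather than left as ${prefix} at configure time. A hedged sketch with invented names:
# Makefile.am fragment: hand the expanded sysconfdir to the compiler
AM_CPPFLAGS = -DSYSCONFDIR='"$(sysconfdir)"'

/* myprog.cpp */
#include <cstdio>
#ifndef SYSCONFDIR
#define SYSCONFDIR "/etc"   /* fallback for builds outside automake */
#endif
static const char* config_file_path = SYSCONFDIR "/myprog/myprog.conf";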
The other approach is to have the program be relocatable at runtime. Many GNU projects (at least gdb and gcc) take this approach. The idea here is for the program to attempt to locate its data in the filesystem at runtime. In the projects I'm most familiar with, this is done with the libiberty function make_relative_prefix; but I'm sure there are other ways.
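A hedged sketch of the libiberty route (the configured paths here are invented; make_relative_prefix is declared in libiberty.h and returns a freshly malloc'd string):
/* relocate.cpp: locate the data directory relative to the running binary. */
#include <libiberty.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char* argv[]) {
  (void)argc;
  /* Configured with --prefix=/usr, so data was installed under
     /usr/share/myprog; if the whole tree was moved, compute where
     that directory now sits relative to argv[0]. */
  char* datadir = make_relative_prefix(argv[0], "/usr/bin", "/usr/share/myprog");
  if (datadir != nullptr) {
    std::printf("data directory: %s\n", datadir);
    free(datadir);
  }
  return 0;
}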
This approach is often touted as being nicer because it allows the program's install tree to be tarred up and delivered to users; but in the days of distros, it seems to me that it isn't as useful as it once was. I think the primary drawback of this approach is that it makes it very hard, if not impossible, to support both relocation and the full suite of configure install-time options.
Which one you pick depends, I think, on what your users want.
Also, to answer the above comment: I think changing the prefix between configure time and build time is not really supported, though it may work with some packages. Instead, the usual way to handle this is either to require the choice at configure time, or to support the somewhat more limited DESTDIR feature.
This is about programming in the large with SML. First, a summary of what seems to be available for that purpose, then a brief recap, and finally the simple question.
The use pseudo‑clause
Top-level type, exception, and value identifiers (standardml.org)
Note that the use function is special. Although not defined precisely, its intended purpose is to take the pathname of a file and treat the contents of the file as SML source code typed in by the user. It can be used as a simple build mechanism, especially for interactive sessions. Most implementations will provide a more sophisticated build mechanism for larger collections of source files. Implementations are not required to supply a use function.
Then later
val use : string -> unit (* implementation dependent *)
Its drawbacks: it is not supported by MLton, at least; and while its behaviour is not standardized, all major SML systems seem to agree on it, namely to reload a unit every time a use of it is encountered. That is not OK given SML's generative semantics: defining a structure multiple times results in as many distinct definitions, which is especially wrong for type definitions.
ML Basis Files
There exist so called “ML Basis Files”: MLBasis (mlton.org) and ML‑Kit ML Basis Files (sourceforge.net).
The load pseudo‑clause
Moscow ML has load, which acts like a use that loads only once, i.e. it does not reload a unit that is already loaded, which is what is needed to compose a system.
Summary
load is nice, but it is recognized only by Moscow ML
ML Basis files may be nice, but they are recognized by neither Poly/ML nor Moscow ML
MLton does not recognize use
Putting everything in a single big bundle file is the only interoperable approach that works with all compilers and interpreters; it works, but it quickly becomes a burden.
The question
Is there a known interoperable way to compose a system made of multiple SML source files?
One system you did not mention is SML/NJ's Compilation Manager (CM), which is quite powerful. And there are a few other, lesser-known systems.
But that notwithstanding, the situation is indeed dire. There simply is no standardised separate compilation mechanism for SML. In practice that means that writing portable Makefiles or something alike is rather painful.
For HaMLet I went through that pain, in order to make it compile with 7 different SML implementations. The approach is to use a restricted (dependency-ordered) CM file and the necessary amount of make + sed hackery to generate meta files for other systems from that. It can also generate a file containing respective 'use' invocations for all the sources, for all other systems that at least support that. All in all it's not pretty, but works sufficiently well.
What packages do you use to handle command line options, settings and config files?
I'm looking for something that reads user-defined options from the command line and/or from config files.
The options (settings) should be divisible into different groups, so that I can pass different subsets of options to different objects in my code.
I know of boost::program_options, but I can't quite get used to the API. Are there light-weight alternatives?
(BTW, do you ever use a global options object in your code that can be read from anywhere? Or would you consider that evil?)
At Google, we use gflags. It doesn't do configuration files, but for flags, it's a lot less painful than using getopt.
#include <gflags/gflags.h>

DEFINE_string(server, "foo", "What server to connect to");

int main(int argc, char* argv[]) {
  google::ParseCommandLineFlags(&argc, &argv, true);
  if (!FLAGS_server.empty()) {   // DEFINE_string(server, ...) generates FLAGS_server
    Connect(FLAGS_server);       // Connect() is the application's own function
  }
}
You put the DEFINE_foo at the top of the file that needs to know the value of the flag. If other files also need to know the value, you use DECLARE_foo in them. There's also pretty good support for testing, so unit tests can set different flags independently.
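For example, a second translation unit would pick the flag up like this (a sketch; LogTarget is a made-up function):
// other_file.cc
#include <cstdio>
#include <gflags/gflags.h>

DECLARE_string(server);   // defined by DEFINE_string(server, ...) in the main file

void LogTarget() {
  // gflags exposes the flag's value as the variable FLAGS_server.
  std::printf("connecting to %s\n", FLAGS_server.c_str());
}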
For command lines and C++, I've been a fan of TCLAP: Templatized Command Line Argument Parser.
http://sourceforge.net/projects/tclap/
Well, you're not going to like my answer. I use boost::program_options. The interface takes some getting used to, but once you have it down, it's amazing. Just make sure to do boatloads of unit testing, because if you get the syntax wrong you will get runtime errors.
And, yes, I store them in a singleton object (read-only). I don't think it's evil in that case. It's one of the few cases I can think of where a singleton is acceptable.
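To be concrete, the shape I mean is roughly this (a sketch, not tied to boost::program_options; the member names are invented):
#include <string>

// Populated once at startup, then treated as read-only everywhere else.
class Options {
 public:
  static Options& Instance() {
    static Options instance;   // constructed on first use, lives for the program
    return instance;
  }
  void Parse(int argc, char* argv[]) {
    // A real implementation would hand argc/argv to the chosen parser
    // (e.g. boost::program_options) and fill the members here.
    if (argc > 1) server_ = argv[1];   // placeholder logic
  }
  const std::string& server() const { return server_; }
 private:
  Options() = default;
  std::string server_;
};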
If Boost is overkill for you, GNU Gengetopt is probably, too, but IMHO, it's a fun tool to mess around with.
And, I try to stay away from global options objects, I prefer to have each class read its own config. Besides the whole "Globals are evil" philosophy, it tends to end up becoming an ever-growing mess to have all of your configuration in one place, and also it's harder to tell what configuration variables are being used where. If you keep the configuration closer to where it's being used, it's more obvious what each one is for, and easier to keep clean.
(As to what I use personally: for everything recently, it's been a proprietary command line parsing library that somebody else at my company wrote, but that doesn't help you much, unfortunately.)
I've been using TCLAP for a year or two now, but randomly I stumbled across ezOptionParser. ezOptionParser doesn't suffer from "it shouldn't have to be this complex"-syndrome the same way that other option parsers do.
I'm pretty impressed so far and I'll likely be using it going forward, specifically because it supports config files. TCLAP is a more sophisticated library, but the simplicity and extra features of ezOptionParser are very compelling.
Other perks from its website include (as of 0.2.0):
Pretty printing of parsed inputs for debugging.
Auto usage message creation in three layouts (aligned, interleaved or staggered).
Single header file implementation.
Dependent only on STL.
Arbitrary short and long option names (dash '-' or plus '+' prefixes not required).
Arbitrary argument list delimiters.
Multiple flag instances allowed.
Validation of required options, number of expected arguments per flag, datatype ranges, user defined ranges, membership in lists and case for string lists.
Validation criteria definable by strings or constants.
Multiple file import with comments.
Exports to file, either set options or all options including defaults when available.
Option parse index for order dependent contexts.
GNU getopt is pretty nice. If you want a C++ feel, consider getoptpp, which is a wrapper around the native getopt.
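A minimal getopt_long sketch, in case it helps (the flag names are arbitrary):
// opts.cpp: parse --server=<name> and --verbose with GNU getopt_long.
#include <getopt.h>
#include <cstdio>
#include <cstdlib>

int main(int argc, char* argv[]) {
  static const struct option longopts[] = {
    {"server",  required_argument, nullptr, 's'},
    {"verbose", no_argument,       nullptr, 'v'},
    {nullptr,   0,                 nullptr,  0 }
  };
  int opt;
  while ((opt = getopt_long(argc, argv, "s:v", longopts, nullptr)) != -1) {
    switch (opt) {
      case 's': std::printf("server = %s\n", optarg); break;
      case 'v': std::puts("verbose on"); break;
      default:  std::exit(EXIT_FAILURE);   // getopt already printed an error
    }
  }
  return 0;
}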
As far as the configuration file is concerned, you should try to make it as simple as possible so that parsing is easy. If you are a bit ambitious, you might want to use yacc & lex, but that would really be overkill for small apps.
I would also suggest that you support both config files and command line options in your application. Config files are better for options that change less frequently; command line options are good for passing immediately changing arguments (typically when you are creating an app that will be called by some other program).
If you are working with Visual Studio 2005 on x86 and x64 Windows, there are some good command line parsing utilities in the SimpleLibPlus library. I have used them and found them very useful.
Not sure about command line argument parsing. I have not needed very rich capabilities in that area and have generally rolled my own to save adding more dependencies to my software. Depending upon what your needs are you may or may not want to try this route. The C++ programs I have written are generally not invoked from the command line.
On the other hand, for a config file you really can't beat an XML based format. It's readable, extensible, structured, etc... :) Plus there are lots of XML parsers out there. Despite the fact it is a C library, I tend to use libxml2 from xmlsoft.org.
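For illustration, a hedged libxml2 sketch (the <config><option name="..." value="..."/></config> layout is invented):
/* readconf.cpp: read a flat XML config file with libxml2. */
#include <libxml/parser.h>
#include <libxml/tree.h>
#include <cstdio>

int main() {
  xmlDocPtr doc = xmlReadFile("myprog.conf.xml", nullptr, 0);
  if (doc == nullptr) return 1;
  xmlNodePtr root = xmlDocGetRootElement(doc);
  for (xmlNodePtr n = root ? root->children : nullptr; n; n = n->next) {
    if (n->type != XML_ELEMENT_NODE) continue;
    xmlChar* name  = xmlGetProp(n, BAD_CAST "name");
    xmlChar* value = xmlGetProp(n, BAD_CAST "value");
    if (name && value) std::printf("%s = %s\n", name, value);
    xmlFree(name);
    xmlFree(value);
  }
  xmlFreeDoc(doc);
  xmlCleanupParser();
  return 0;
}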
Try Apache Ant. Its primary usage is Java projects, but there isn't anything Java about it, and it's usable for almost anything.
Usage is fairly simple and you've got a lot of community support too. It's really good at doing things the way you're asking.
As for global options in code, I think they're quite necessary and useful. Don't misuse them, though.