I'm working on an open-source programming language, and I want my users to be able to distribute standalone .exe files built from their programs. My strategy has three components:
1. A DLL that contains the interpreter.
2. A small .o object file (generated once from C) that invokes the DLL to start execution.
3. A generated .o file containing a binary representation of the user's program, embedded as a binary blob and linked with #2.
When the user requests an .exe, #2 and #3 are linked together, and the resulting executable can be distributed alongside #1. So far so good.
The problem I have now is that this means MinGW has to be bundled with the language in order to perform the linking step. I don't want my users to have to download MinGW manually (my primary audience is children), and the standard MinGW distribution is more than 100 megabytes, so bundling all of it would spoil the minimalism of my language's download (currently ~5 MB).
My question is: is there a definitive list of files to be yanked out of \MinGW and bundled with the language by themselves, so that g++.exe still works to link two .o files and the needed libraries together?
Alternative solutions are also welcome (for example, a freely redistributable C++ compiler that's more easily bundled with other apps).
You could try using Dependency Walker to rip g++ out of MinGW; it will generate your list of dependencies. Alternatively, you could use Cygwin, which reduces your footprint to around 15 MB.
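If you do go down the trimming route, keep in mind the end goal is only a working link step. A minimal sketch, assuming a trimmed MinGW tree in ./mingw-lite (the layout and names here are assumptions, not a verified file list):

./mingw-lite/bin/g++.exe stub.o program.o -o user_program.exe   # stub.o = component #2, program.o = component #3

Running Dependency Walker over g++.exe, ld.exe, and the produced .exe shows which DLLs and lib/ files the trimmed tree still needs.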
I'm a junior programmer. I have developed a Visual Studio C++ project with a fair number of dependencies: Boost, a fingerprint recognition library, and the Windows Biometric Framework. As of today I know the Windows Biometric Framework can be downloaded through standard Windows Update, so I am not concerned about that; to my knowledge, the application can find and link the WBF dependencies on the computer by itself.
My concern is: what is the easiest way (not the most efficient; I need development speed here) to pack the executable file with all the resources and dependencies this .exe needs (Boost and the fingerprint recognition SDK), so that I can minimize distribution troubles ("this DLL is missing, please reinstall the application", and things like that) without having to compile everything on the client's computer?
I've seen a couple of ways here: copying the DLLs listed in the project config, or switching to static linking... but I don't know if that is the simplest way. I have little trust in my abilities here, and those methods seem quite manual; I'm wondering if there might be an automatic way of doing these things.
I'm not familiar with the fingerprint library or WBF, but most of Boost lives in headers, so it's compiled in when you compile your application. Some parts, like the threading library and system-specific calls (e.g. getting the CPU core count), are separate libraries that you link statically.
In what form is the fingerprint library provided? If dynamic, there will be at least a .dll with a corresponding import .lib file. Your application links against the import library at build time and binds to the DLL at run time. Alternatively, the library may come as one large, single .lib that is linked into your application after it is compiled. If you have both options available and you only want to distribute the binary file, use static linking.
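As a rough illustration of the difference, in GCC-style commands (the library names are placeholders; in Visual Studio you would list the .lib under the linker's input settings instead):

g++ app.o -o app.exe -L. -lfingerprint    # dynamic: links the import library; fingerprint.dll must ship with the app
g++ app.o -o app.exe libfingerprint.a     # static: the library code is baked into app.exe, no DLL to distribute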
As on any system, you will need to include every .dll your app links against and every external resource (images, config files, ...) your app uses. I usually build my Windows distributions with Inno Setup: http://www.jrsoftware.org/isinfo.php.
Very easy to use.
Let's say I have created and compiled a simple program using MinGW-w64 (the g++ compiler). Running this program on my computer and looking in Process Explorer for which DLL files the program is using, I find (among many others):
libgcc_s_seh-1.dll
libstdc++6.dll
libwinpthread-1.dll
These are the only ones that reside under my MinGW installation folder. The rest of the DLL files used reside under C:\Windows.
Question 1:
Are the MinGW DLL files the MinGW C++ runtime libraries (so to speak)? Do they serve the same purpose as, for example, msvcrXXX.dll (XXX = version of the Microsoft runtime library)?
Question 2:
If I want to run the application on a different computer which does not have MinGW installed, is it sufficient to include those DLL files listed above (i.e. placing them in the same folder as my executable) to have it run on the other computer (assuming the other computer is also a 64-bit Windows machine)? If yes, does this mean we basically ship the MinGW C++ runtime with our executable? If no, why not?
libstdc++6.dll is the C++ standard library, like you said.
libwinpthread-1.dll is for C++11 threading support. MinGW-w64 has two possible thread variants: either use the native Windows functions like CreateThread (but C++11 facilities like std::thread won't be available then), or include this library and use the C++11 classes too.
Note that to switch the thread model, you'll need to reinstall MinGW. Just removing the DLL and not using the C++11 facilities won't work; the DLL will be required nonetheless with your current install.
libgcc_s_seh-1.dll is the GCC runtime support library; the "seh" in its name refers to the exception-handling model (Windows structured exception handling) used by 64-bit MinGW-w64 builds.
Yes, it should be sufficient to deliver those DLLs alongside your executable (or use static linking and deliver only your program file).
For complicated projects where you're not exactly sure which DLL files need to be included to distribute your application, I made a handy dandy Bash script (for MSYS2 shells) that can tell you exactly what DLL files you need to include. It relies on the Dependency Walker binary.
#!/usr/bin/sh
depends_bin="depends.exe"
target="./build/main.exe" # Or wherever your binary is
temp_file=$(mktemp)
output="dll_list.txt"

# Run Dependency Walker in console mode (/c), writing a CSV of modules (/oc:)
MSYS2_ARG_CONV_EXCL="*" "$(cygpath -w "$depends_bin")" /c /oc:"$(cygpath -w "$temp_file")" "$(cygpath -w "$target")"

# Keep only the module paths that come from the MinGW tree
cut -d , -f 2 "$temp_file" | grep mingw32 > "$output"
rm "$temp_file"
Note that this script would need slight modification for use in regular MSYS (the MSYS2_ARG_CONV_EXCL variable and the cygpath calls in particular). It also assumes your MinGW DLL files are located in a path containing "mingw32" (the string the grep matches).
You could potentially even use this script to automatically copy the DLL files in question into your build directory as part of an automatic deploy system.
You may like to add the options -static-libgcc and -static-libstdc++ to link the C and C++ standard libraries statically and thus remove the need to carry around any separate copies of those.
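For example (the file and program names here are placeholders):

g++ main.cpp -o main.exe -static-libgcc -static-libstdc++
g++ main.cpp -o main.exe -static    # fully static; also removes the libwinpthread-1.dll dependency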
I used ntldd to get a list of dependencies.
https://github.com/LRN/ntldd
I'm using MSYS2, so I just installed it with pacman. Use that and then copy all the needed dependencies.
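For example, under MSYS2 (the exact package name may differ between repositories):

pacman -S mingw-w64-x86_64-ntldd-git
ntldd -R main.exe | grep -i mingw    # -R follows dependencies recursively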
There are several major challenges to distributing compiled software:
1. Compiling the code for all target processors (remember, when it comes to compiled code, you need to produce separate downloads/distributions for each type of instruction set architecture).
2. Ensuring that the builds are reproducible, consistent, and can be easily correlated with a specific version of the code (and versions of the dependencies).
3. Ensuring that the build output is self-contained and includes all of its dependencies within it (so that it is not dependent on any other installations that happen to exist on just your system).
4. Making sure that your code is built and distributed regularly, with updates distributed automatically so that -- in the event of security issues -- you can push out new patched versions.
For convenience and to increase reach, it is nice for non-savvy users to have a prebuilt version that they can install. However, I would recommend sharing the source code as a first step.
Most of these requirements are fairly non-trivial to meet and often require automating not only the build process, but also the instantiation and configuration of the VMs in which the build takes place. However, there are open source projects that can help... for example, check out Gitian.
In terms of point #3, the key thing here is to use static linking... while this does make the binary you distribute much larger (because its dependencies are now baked into the output), it also isolates your binary from the versions of the libraries on the system (avoiding "dependency hell").
Point #4 is very tricky, but thankfully there are open source tools to help here as well, such as cloudup, which provides a way to add auto-updating capability to your application distribution.
I am learning C++ and, in order to do so, am writing a sample wxWidgets application.
However, none of the documentation I can find on the wxWidgets website tells me what library names to pass to the linker.
Now, apart from wxWidgets, is there a general rule of thumb or some other convention by which I should know, for any given library, the names of the library files against which I need to link?
We have more of a "rule of ring finger" than a thumb.
Generally, if you compile a library by hand, it will produce several library files (usually .a, .lib, or something similar, depending entirely on your compiler and your ./configure). These are typically produced by a makefile's build script.
Now, a makefile can be edited in any way the developer pleases, but there are good conventions (there is, in fact, a standard) that many follow, and there are tools to auto-generate the makefiles for a library (see automake).
Makefiles are usually consistent
You can simply use the makefile to generate the files, and if it's compliant, the files will be placed in a particular folder (the lib folder, I believe?), all queued up and ready to use!
Keep in mind, a library file is simply the implementation of your code in precompiled format; you could create a library yourself from your own code quite easily using the ar tool. Because it is code, just like any other code, you don't necessarily want to include all of the library files for a given library. For instance, with wxWidgets, if you're not using rich text, you certainly don't want to waste the extra space in your final executable by including that library file. Later, if you want to use it, you can add it to your project (just like adding a .cpp file).
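For example, rolling your own static library by hand (the file names are illustrative):

g++ -c foo.cpp bar.cpp            # produces foo.o and bar.o
ar rcs libmylib.a foo.o bar.o     # archive them into a static library
g++ main.cpp -L. -lmylib -o app   # link against it like any other library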
Oh, and also with wxWidgets, in their (fantastic) documentation, each module says which header you need to include and which library it is a part of.
Happiness
Libraries are amazing, magical unicorns of happiness. Just try not to get too frustrated with them and they'll prance in the field of your imagination for the rest of your programming career!
After a bit more Googling, I found a page on the wxWidgets wiki which relates to the Code::Blocks IDE, but which also works for me. By adding the following to the linker options, it picks up all the necessary files to link:
`wx-config --libs`
(So that does not solve my "general rule" problem; for any library I am working with, I still have to find out what files to link against, but at least this solves the problem for wxWidgets).
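For reference, a typical full build line using wx-config on Unix-like builds looks like this:

g++ myapp.cpp `wx-config --cxxflags --libs` -o myapp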
The build instructions are different for each platform and so you need to refer to the platform-specific files such as docs/gtk/install.txt (which mentions wx-config) or docs/msw/install.txt to find them.
FWIW, the wxWidgets project would also gratefully accept any patches to the main manual that improve the organization of the docs.
If I have a C++ source file, gcc can list all of its dependencies, in a tree structure, using the -H option. But given only the C++ executable, is it possible to find all the libraries and header files that went into its compilation and linking?
If you've compiled the executable with debugging symbols, then yes, you can use the symbols to get the files.
If you have .pdb files (Visual Studio creates them to store debugging information separately), you can use all kinds of programs to open them and see the source files and methods.
You can even open one with a text editor and you'll see, among the gibberish, a list of the functions and source files.
If you're using Linux (or GNU compilers in general), you can use gdb (again, only if debug symbols were enabled at compile time).
Run gdb on your executable, then run the command: info sources
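A minimal session looks like this (assuming the binary was built with -g; the program name is a placeholder):

gdb ./myprog
(gdb) info sources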
That's an important reason why you should always remove the debug flag (or strip the binary) when going into production: you don't want clients poking around in your sources, functions, and code.
You cannot do that in general, because the executable might have been built on a machine where the header files (or the C++ code, or the libraries) are private or even generated. Also, if a static library is linked in, you have no reliable way to find out.
In practice, however, on Linux, using nm, objdump, or ldd on the executable will often (but not always) give you a good clue about the needed libraries.
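For example (myprog is a placeholder):

ldd ./myprog                        # shared libraries resolved at load time
objdump -p ./myprog | grep NEEDED   # the DT_NEEDED entries directly
nm -D ./myprog                      # dynamic symbols, which often hint at the libraries involved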
Also, some executables dynamically load plugins, e.g. using dlopen, so your question might not even be meaningful for them (since the plugin is known only at runtime).
Notice also that you might not know whether an executable was obtained by compiling C++ code at all (you might not be able to tell whether it came from C, C++, D, or OCaml, ... source code, or a mixture of them).
On Linux, if you build an executable with static linking and stripping, people won't be able to easily guess the source programming language that you have used.
BTW, on Linux distributions, it is the role of the package management system to deal with such dependencies.
As answered by Yochai Timmer, if the executable contains debug information (e.g. in DWARF format), you should be able to get a lot more information.
I am currently working in Ubuntu Linux. I am working with a .hpp file and a .cpp file. From these two I am creating an .a file (like a DLL, in order to use and work with my application on any computer that has Linux installed). I mention that both the .hpp and the .cpp are in folder1.
I would like to ask :
If I include in the .cpp a header with an absolute path like:
#include "/home/tests/folder1/folder2/header.h"
will this work correctly after I create the .a using ar rcs and send my .a to another computer?
does the path to a specific header from a folder influence the .a created?
If I had to download, for example, gsoap in order to accomplish my task: after I've created the .a file that contains a lot of .xml and .cpp/.h files from gsoap plus my own .h and .cpp files, do I need to create a makefile in order to download gsoap on the computer where I want to use my .a (or DLL in Windows) application?
"Any computer that has linux installed" isn't going to work. Linux encases a wide variety of platforms and architectures, unlike Windows which generally encases only two (and the 64-bit versions are backward compatible with the 32-bit versions, so the .dlls always work).
As already mentioned elsewhere, a .a file is a static library and is equivalent to a Windows .lib, not a .dll. The Linux equivalent of a Windows .dll is the .so "shared object".
No. #includes are resolved by the preprocessor in a step prior to compilation. The contents of the file are literally inserted into the in-memory copy of your source file, and then the whole lot is compiled. The string with the folder does not exist in your compiled module. That said, writing absolute paths is very bad form: it means you cannot move your development environment/directory anywhere. Use relative paths; they should be relative to your current working directory and/or to your defined include path. Read your toolchain's documentation for more information.
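For example, instead of the absolute #include from the question, you can pass an include directory to the compiler and keep the path in the source relative (the paths are the ones from the question; the source file name is a placeholder):

g++ -I/home/tests/folder1 -c myfile.cpp
# and inside myfile.cpp:  #include "folder2/header.h"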
If you statically link gsoap, then you don't have to do anything; it's compiled into your project. If you want to dynamically link it, then your .a should not contain any .cpp files from gsoap. The target computer must have the gsoap shared libraries installed, and this will be a required dependency that your installer or your user must resolve. Makefiles do not download dependencies; package managers do.
Actually, a ".a" file is an archive file. Linux chose that format for its library files, so you can compare it to ".lib" (".so" is the rough equivalent of ".dll").
There are a number of stages of compilation: preprocessing, compiling, assembling and linking.
Preprocessing effectively answers your first question, because the code in the .h/.hpp file is inserted into the .cpp file, meaning that when your code is compiled, all the code necessary to compile the .cpp file successfully is in that one file.
Compiling turns your code into assembly instructions for the specific computer that you're using. This means that if your code was built to run on a PowerPC computer (e.g. an older Mac), it would use machine instructions that any PPC computer could run (meaning that Intel, AMD, SPARC, Alpha, etc. computers couldn't use your code). This answers your question about moving a ".a" file to another computer: you can use it as long as the computer's processor AND operating system are compatible (you may have a 64-bit processor, but that doesn't mean 32-bit Windows will let you use it to its full capacity).
Assembling converts the primitive text-based assembly instructions into machine instructions that the processor can understand. This creates an object file (.obj on Windows, .o on Linux). This file is what goes in the library (.lib on Windows, .a on Linux). There are other names for machine instructions such as "machine code" and "object code", and any one of them can be used to describe the same thing.
Linking is the last stage. It takes the necessary code from libraries and the various necessary object files and turns them into an actual binary (.exe file on Windows, Linux doesn't need an extension because of how it is designed). This is your application.
Because linking is the last stage, the gsoap library (for example) must be specified in addition to your library or else the linker will say it couldn't find certain "symbols". However, as with your library, the gsoap library must be on that computer to be able to use it. Installing it with the package manager is preferred when possible, but if you can't do that, you need to compile it on that computer. If you're moving from a PPC computer to an Intel/AMD computer, you would also need to re-compile your library as well as gsoap (if you couldn't install gsoap via package manager).
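The four stages can also be run as separate commands, which makes the pipeline concrete (the file names and the gsoap library name are illustrative):

g++ -E main.cpp -o main.ii      # preprocessing: headers are pasted into the source
g++ -S main.ii -o main.s        # compiling: C++ down to assembly for this CPU
g++ -c main.s -o main.o         # assembling: assembly into an object file
g++ main.o -lgsoap++ -o myapp   # linking: object files + libraries into the binary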
"does the path to a specific header from a folder influence the .a created?" - Only may be debug information. Nothing that would prevent it from working if you copy it to another place.
*.a is a static library. It is like *.lib on Windows, not like *.dll.
You can move any static or dynamic library (*.a/*.so on Linux, *.lib/*.dll on Windows) to any folder/computer/planet you like and use it there, as long as its dependencies are satisfied (all the necessary dynamic and static libs, software, and hardware that your library depends on are available). Of course, running code that uses your library will require the CPU architecture you compiled for and all the dynamic libs your code uses directly or indirectly.
Not directly related to the question asked: don't #include files by absolute paths. Ever. Define and use include directories. It is a matter of style and readability. Includes like "/home/user/working_dir/blabla.h" or "D:/working_dir/blabla.h" or "..\..\some\directory\blabla.h" are ugly and unmaintainable. Includes like <blabla.h> or <blabla/defs.h> are perfect for library APIs, and "blabla.h" or "subdir/blabla.h" are OK for internal headers.