I'm compiling Linux libraries (for Android, using NDK's g++, but I bet my question makes sense for any Linux system). When delivering those libraries to partners, I need to mark them with a version number. I must also be able to access the version number programmatically (to show it in an "About" dialog or a GetVersion function for instance).
I first compile the libraries with an unset version (0.0) and need to change it to a real one once testing is done, just before sending the library to the partner. I know it would be easier to modify the source and recompile, but we don't want to do that, for three reasons: recompiling would mean testing everything again; we feel patching is less error prone (see the comments on this post); and our development environment already works this way for Windows binaries, where we set a 0.0 version string in the resources (.rc) and later change it with verpatch. We'd like to use the same kind of process when shipping Linux binaries.
What would be the best strategy here?
To summarize, requirements are:
Compile binaries with "unset" version (0.0 or anything else)
Be able to modify this "unset" version to a specific one without having to recompile the binary (ideally, run a 3rd party tool command, as we do with verpatch under Windows)
Be able to have the library code retrieve its version information at runtime
If your answer is "rename the .so", then please provide a solution for 3.: how to retrieve version name (i.e.: file name) at runtime.
I was thinking of some solutions but have no idea if they could work and how to achieve them.
Have a version variable (one string or 3 int) in the code and have a way to change it in the binary file later? Using a binary sed...?
Have a version variable within a resource and have a way to change it in the binary file later? (as we do for win32/win64)
Use a field of the .so (like SONAME) dedicated to this and have a tool allowing to change it...and make it accessible from C++ code.
Rename the lib + change SONAME (did not find how this can be achieved)...and find a way to retrieve it from C++ code.
...
Note that we use QtCreator to compile the Android .so files, but they may not rely on Qt. So using Qt resources is not an ideal solution.
I am afraid you started to solve your problem from the end. First of all, SONAME is provided at link time as a linker parameter, so to begin with you need a way to get the version from the source and pass it to the linker. One possible solution is to use the ident utility and embed a version string in your binary, for example:
const char version[] = "$Revision: 1.2 $";
This string should appear in the binary, and the ident utility will detect it. Alternatively, you can parse the source file directly with grep or something similar. If conflicts are possible, add an extra marker that you can later use to detect the string, for example:
const char version[] = "VERSION_1.2_VERSION";
So you detect the version number either from the source file or from the .o file and just pass it to the linker. This should work.
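A minimal sketch of the marker-based detection described above, assuming the VERSION_..._VERSION convention from the example. The function name extract_marked_version is hypothetical; a build script could run something like it over the contents of a source or object file before invoking the linker.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the text between "VERSION_" and "_VERSION",
// or an empty string when no complete marker pair is present.
std::string extract_marked_version(const std::string& data) {
    static const std::string prefix = "VERSION_";
    static const std::string suffix = "_VERSION";

    const std::size_t start = data.find(prefix);
    if (start == std::string::npos) return "";

    const std::size_t begin = start + prefix.size();
    const std::size_t end = data.find(suffix, begin);
    if (end == std::string::npos) return "";

    return data.substr(begin, end - begin);
}
```

For the example string VERSION_1.2_VERSION this yields 1.2, which is the value that would then be passed on to the linker.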
As for the debug version having version 0.0, that is easy: skip the detection when you build debug and use 0.0 as the version unconditionally.
For a build system I would recommend cmake, but this is just my personal preference. The solution can easily be implemented in a standard Makefile as well. I am not sure about qmake though.
Discussion with Slava made me realize that any const char* was actually visible in the binary file and could then be easily patched to anything else.
So here is a nice way to fix my own problem:
Create a library with:
a definition of const char version[] = "VERSIONSTRING:00000.00000.00000.00000"; (we need it long enough as we can later safely modify the binary file content but not extend it...)
a GetVersion function that would clean the version variable above (remove VERSIONSTRING: and useless 0). It would return:
0.0 if version is VERSIONSTRING:00000.00000.00000.00000
2.3 if version is VERSIONSTRING:00002.00003.00000.00000
2.3.40 if version is VERSIONSTRING:00002.00003.00040.00000
...
Compile the library, let's name it mylib.so
Load it from a program, ask its version (call GetVersion), it returns 0.0, no surprise
Create a little program (did it in C++, but could be done in Python or any other language) that will:
load a whole binary file content in memory (using std::fstream with std::ios_base::binary)
find VERSIONSTRING:00000.00000.00000.00000 in it
confirm it appears once only (to be sure we don't modify something we did not mean to; that's why I prefix the string with VERSIONSTRING, to make it more unique...)
patch it to VERSIONSTRING:00002.00003.00040.00000 if expected binary number is 2.3.40
save the binary file back from patched content
Patch mylib.so using the above tool (requesting version 2.3 for instance)
Run the same program as step 3., it now reports 2.3!
No recompilation nor linking, you patched the binary version!
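The steps above can be sketched as follows. This is only an illustration under stated assumptions, not the author's actual tool: the names format_version, patch_version, read_file and write_file are hypothetical, and the "keep at least two fields" rule is inferred from the 0.0 / 2.3 / 2.3.40 examples.

```cpp
#include <cstddef>
#include <fstream>
#include <sstream>
#include <string>
#include <vector>

// Turns "VERSIONSTRING:00002.00003.00040.00000" into "2.3.40".
// Leading zeros are stripped per field and trailing zero fields are dropped,
// but at least two fields are kept so the unset value reads "0.0".
// Returns an empty string if the input is not in the expected format.
std::string format_version(const std::string& raw) {
    static const std::string tag = "VERSIONSTRING:";
    if (raw.compare(0, tag.size(), tag) != 0) return "";

    std::vector<unsigned long> fields(1, 0);
    for (std::size_t i = tag.size(); i < raw.size(); ++i) {
        const char c = raw[i];
        if (c == '.') fields.push_back(0);
        else if (c >= '0' && c <= '9') fields.back() = fields.back() * 10 + (c - '0');
        else return "";
    }
    while (fields.size() > 2 && fields.back() == 0) fields.pop_back();

    std::string out;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) out += '.';
        out += std::to_string(fields[i]);
    }
    return out;
}

// Replaces the placeholder inside the in-memory file image. Fails (leaving
// the image untouched) unless the placeholder occurs exactly once and the
// replacement has the same length, so no offset in the binary moves.
bool patch_version(std::string& image, const std::string& placeholder,
                   const std::string& replacement) {
    if (placeholder.empty() || placeholder.size() != replacement.size()) return false;
    const std::size_t first = image.find(placeholder);
    if (first == std::string::npos) return false;
    if (image.find(placeholder, first + 1) != std::string::npos) return false;
    image.replace(first, placeholder.size(), replacement);
    return true;
}

// Reads a whole file as raw bytes (empty string if it cannot be opened).
std::string read_file(const std::string& path) {
    std::ifstream in(path.c_str(), std::ios::binary);
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}

// Writes raw bytes back, truncating any existing file.
bool write_file(const std::string& path, const std::string& data) {
    std::ofstream out(path.c_str(), std::ios::binary | std::ios::trunc);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return out.good();
}
```

A patch run would then be read_file on mylib.so, patch_version, and write_file. One caveat worth checking on your own toolchain: an optimizing compiler may fold a constant string into the code that parses it, in which case patching the bytes would not change what GetVersion returns. Declaring the array volatile or defining it in a separate translation unit are possible ways to reduce that risk.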
I'm using arm-none-linux-gnueabi-g++ in order to compile a c++ code that will run on an embedded Linux device.
I'm using the arm-none-linux-gnueabi-g++ under windows and get as output the binary file that will run on the Linux machine.
In order to set the embedded device with a new binary, I need to create an archive file (zip) with the binary file and with some more settings files.
So far it's all OK.
I need to automate this so that the archive file is created automatically and named after the version of the binary file.
Currently, we keep the version as just a simple constant std::string variable in the code. We use that string when printing diagnostic, logging, etc.
How can I read the version from the binary file?
Or are there other methods to achieve that goal?
I thought I might store it in some fixed place in the binary file and read it from there, but I really don't know how to do that without corrupting the binary.
You are creating the file automatically, so I assume you are first compiling it and then making an archive with the resulting binary.
You could store the version in a text file, and #include that file in your code:
const std::string version =
#include "version.txt"
;
In the version.txt:
"version string"
And when making the archive, you can easily parse the version from the text file.
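As a rough sketch of the packaging side, assuming version.txt contains a single double-quoted literal with no escape sequences, the archive step could recover the version like this. The names parse_version_literal and archive_name, and the product-version.zip naming scheme, are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Returns the text between the first pair of double quotes in the file
// contents, or an empty string if there is no such pair. Escape sequences
// inside the literal are not handled.
std::string parse_version_literal(const std::string& contents) {
    const std::size_t open = contents.find('"');
    if (open == std::string::npos) return "";
    const std::size_t close = contents.find('"', open + 1);
    if (close == std::string::npos) return "";
    return contents.substr(open + 1, close - open - 1);
}

// Builds an archive file name from a product name and the parsed version.
std::string archive_name(const std::string& product, const std::string& version) {
    return product + "-" + version + ".zip";
}
```

Because the same file is both #included by the code and read by the packaging step, the binary and the archive name cannot drift apart.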
Ville is correct.
You're currently doing it backwards!
Your build system should provide the version to the executable, not the other way around. Once this is fixed, your build system can provide the same version to other elements, such as your ZIP filename.
Ideally the version would be generated from version control autonomously, but you could specify it in the build command if really necessary.
It's possible to pull some string from the binary (think nm, if there's a Windows equivalent), but that's really the reverse way to do it.
I have seen one other answer link, but what I don't understand is: what is basis.cm and what is its use?
You are asking two questions.
What is basis.cm and what is its use?
This is the Basis library. It allows the use of built-in functions.
How to compile and execute a stand-alone SML-NJ executable
Assuming you followed Jesper Reenberg's tutorial on how to execute a heap image, the next thing you need in order to have SML/NJ produce a stand-alone executable is to convert this heap image. One should hypothetically be able to do this using heap2exec, a tool that takes the heap image, e.g. the .x86-linux file generated on my system, and generates an .asm file that can be assembled and linked.
Unfortunately, this tool is not very well-maintained, so you have to
Go to the smlnj.org page and fix the download link by removing 'www.' (this page and the SourceForge page don't contain the same explanations or assumptions about argument count, and neither page's download link works).
Download and extract this tool, and fix the 'build' script so it points to your ml-build tool
Fix the tool's argument use by changing [inf, outf] to [_, inf, outf]
Run ./build which generates 'heap2asm.x86-linux' on my system
For example, in order to generate an .asm file for the heap2asm program itself, run
sml #SMLload heap2asm.x86-linux heap2asm.x86-linux heap2asm.s
At this point, I have unfortunately been unable to produce an executable that works. E.g. if you run gcc -c heap2asm.s and ld heap2asm.o, you get a warning of a missing _start label. The resulting executable segfaults even if you rename the existing _sml_heap_image label to _start. That is, it seems that a piece of entry code that the runtime environment normally delivers is missing here.
At this point, discard SML/NJ and use MLton for producing stand-alone binaries.
I work in a very regulated environment where we need to be able to produce identical binary output given the same source code every time we build our products. We currently use an ancient version of g++ that has been patched to not write anything like a date/time in the resulting binaries that would change from build to build, but I would like to update to g++ 4.7.2. Does anyone know of a patch, or have suggestions of what I need to look for to take two identical pieces of source code and produce identical binary outputs?
The Debian Reproducible builds project attempts to standardize Debian packages byte-by-byte, and has received a Linux Foundation grant in 2016.
While this may include more than compilation, you should have a look at it.
It also pointed me to this article, which adds the following points to what @Employed said:
put the source in a fixed folder (e.g. /tmp/build) to deal with __FILE__
for __DATE__, __TIME__, __TIMESTAMP__:
libfaketime : https://github.com/wolfcw/libfaketime
override those macros with -D
-Wdate-time or -Werror=date-time: warn or fail if any of __TIME__, __DATE__ or __TIMESTAMP__ is used. The Linux kernel 4.4 uses it by default.
use the D flag with ar, or use https://github.com/nh2/ar-timestamp-wiper/tree/master to wipe stamps
-fno-guess-branch-probability: older manual versions say it is a source of non-determinism, but not anymore. Not sure if this is covered by -frandom-seed or not.
Buildroot has a BR2_REPRODUCIBLE option which may give some ideas on the package level, but it is far from complete at this point.
Related threads:
https://superuser.com/questions/639351/does-recompiling-a-program-produce-a-bit-for-bit-identical-binary
https://www.quora.com/What-can-be-the-possible-reasons-for-the-object-code-of-an-unchanged-C-file-to-change-on-recompilation
We also depend on bit-identical rebuilds, and are using gcc-4.7.x.
Besides setting PWD=/proc/self/cwd and using -frandom-seed=<input-file-name>, there are a handful of patches, which can be found in svn://gcc.gnu.org/svn/gcc/branches/google/gcc-4_7 branch.
Use of the __DATE__ macro makes the build non-deterministic
My environment is Linux CentOS 6.2, and I have a source control system such as svn/hg/git. My source code is C/C++.
I want to check in the built binary to keep track of which binary was released to the customer.
I assume the built binary's checksum will differ whenever the source code changes.
That way, I could trace back which binary was built from which version.
Is this possible, and what tricks must I follow?
I've seen some executables display their revision when run with the -version option.
But I wonder how to avoid writing a wrong -version string into the executable.
Suppose I keep an md5.txt and check that in instead of the binary.
How could I make sure I can build an executable with the same md5 again?
Sorry, to clarify my question and prevent further unexpected answers, I would prefer an answer like:
Keep a md5sum.txt in the SCM when releasing a new version to users.
Keep the binary separate from your SCM.
To rebuild a binary with the same md5sum, you should make sure to:
write a version symbol into the binary at build time (e.g. with -DVERSION="1.x")
show the VERSION string to the user
remove all $Id keywords, which make your SCM run slower
keep the same CPU, OS, compiler and library environment
...
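A minimal sketch of the -DVERSION idea from the list above. It assumes the macro arrives as a bare token such as 1.x, which is what the compiler sees after a shell strips the quotes from -DVERSION="1.x", and turns it into a string with the usual two-level stringify idiom. The 0.0 fallback and the name version_string are assumptions made for illustration.

```cpp
#include <string>

#ifndef VERSION
#define VERSION 0.0  // fallback for builds that do not pass -DVERSION=...
#endif

// Two levels are needed so that VERSION is expanded before being stringified.
#define STRINGIFY_IMPL(x) #x
#define STRINGIFY(x) STRINGIFY_IMPL(x)

// Returns the version baked in at compile time, e.g. "1.x" or "0.0".
std::string version_string() {
    return STRINGIFY(VERSION);
}
```

Since the value comes from the build command rather than from an edited source file, the same source revision and the same command line should produce the same string.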
Create strings within a .cpp file like this:
static const char version[] = "@(#) $Id$";
where $Id$ is obtained from SVN
Use the what command (see the manual page). It will obtain these strings from the binary so you can check.
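To illustrate what the what command looks for, here is a rough re-implementation of its scan over an in-memory file image. It relies on the documented behaviour of what: find the @(#) marker and print what follows up to the first double quote, >, newline, backslash or NUL byte. The function name find_what_strings is hypothetical, and this is a sketch rather than a replacement for the real utility.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Collects every string that follows an "@(#)" marker in `data`, each one
// ending at the first '"', '>', newline, backslash or NUL byte (or at the
// end of the data if none of those follows).
std::vector<std::string> find_what_strings(const std::string& data) {
    static const std::string marker = "@(#)";
    // Explicit length so the trailing NUL is part of the terminator set.
    static const std::string terminators("\">\n\\\0", 5);

    std::vector<std::string> found;
    std::size_t pos = 0;
    while ((pos = data.find(marker, pos)) != std::string::npos) {
        const std::size_t begin = pos + marker.size();
        std::size_t end = data.find_first_of(terminators, begin);
        if (end == std::string::npos) end = data.size();
        found.push_back(data.substr(begin, end - begin));
        pos = end;  // continue scanning after this match
    }
    return found;
}
```

Running it over the bytes of a binary built with such a version string should return the expanded $Id$ text, which can then be compared with the revision you expect.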
Is this an executable or a shared library? If the latter, you could export a function that would return the version (number, string, your choice). Then dlopen(), dlsym(), and execute the function.
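A minimal sketch of the dlopen()/dlsym() route, assuming the library exports a C-linkage function declared as extern "C" const char* GetVersion(); that signature and the helper name query_library_version are assumptions. On older glibc versions this needs to be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <string>

// Loads the shared library at `path`, calls its exported GetVersion()
// (assumed signature: extern "C" const char* GetVersion();) and stores the
// result in `version_out`. Returns false if the library or symbol is missing.
bool query_library_version(const std::string& path, std::string& version_out) {
    void* handle = dlopen(path.c_str(), RTLD_LAZY | RTLD_LOCAL);
    if (!handle) return false;

    typedef const char* (*GetVersionFn)();
    // POSIX requires that the void* returned by dlsym can be converted
    // to a function pointer.
    GetVersionFn get_version =
        reinterpret_cast<GetVersionFn>(dlsym(handle, "GetVersion"));

    bool ok = false;
    if (get_version) {
        const char* version = get_version();
        if (version) {
            version_out = version;  // copy before the library is unloaded
            ok = true;
        }
    }
    dlclose(handle);
    return ok;
}
```

The string is copied before dlclose() because the returned pointer most likely refers to memory inside the library, which may be unmapped once the handle is closed.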
For executable ELF binaries, you might be able to implant some data in the binary that can be queried using the 'nm' utility.
If you use Subversion, SvnRev will do most of the work for you (no md5 in the repo; the repo holds the sources, and the binary carries a resource with the revision ID).
For Mercurial, you can get the idea for a version string from the VersioningWithMake wiki. To get a string like the result of git describe, instead of the simple template {node|short} for HGVERSION you can use something like {latesttag}+{latesttagdistance}:{node|short}, which shows (for example) 1.3+11:8a226f0f99aa
When I compile the release version of my iOS app (based on standard Apple supplied iOS app template), look into the resulting executable binary, I see all sorts of symbols and even local cpp source and header paths in there. I'm really stumped why this is (I haven't enabled RTTI*). Especially the source file paths make me feel uncomfortable sending this app across the globe (why should everyone be able to see the directory layout of my development machine?).
Here are two (randomly picked, moderated) excerpts:
TS/../ACTORS/CActorCanvasCharPart.cpplastMeshcapVerticesOFF BOUNDSupload VERTICES: %d
20CActorCanvasCharPartgrassscrub/Volumes/Data/iOS_projects/code/MyAppName_proj/MyAppName/source/STATES/GAMES/2/CStateGame2_grass.cppbaseShadowmowerstartmowerloopmowermowerCharcutGrassChargrassStuffgrassParticles/Volumes/Data/iOS_projects/code/MyAppName_proj/MyAppName/source/STATES/GAMES/2/CStateGame2_grass.h17CStateGame2_grasssinwriteStroke/Volumes/Data/iOS_projects/code/MyAppName_proj/MyAppName/source/STATES/GAMES/2/CStateGame2_flowers.hflowerBedsandTrailclickstart3inplace2sandDrag/Volumes/Data/iOS_projects/code/MyAppName_proj/MyAppName/source/STATES/GAMES/2/CStateGame
And here are a lot of symbols for self-defined types and structs:
CAssetMgr="_vptr$CMgrBase"^^?"pMain"^{CMain}"inited"B"curveCount"S"curveSpecs"^{CCurveSpec}"gameSpecs"[23{CGameStateSpec="header"{SpecDiskHeader="type"i"version"S}"gameID"C"backgroundColor"{CRGBAcolorf="r"f"g"f"b"f"a"f}"clickPointColor"{CRGBAcolorf="r"f"g"f"b"f"a"f}"clickPointIconColor"{CRGBAcolorf="r"f"g"f"b"f"a"f}"hintColor"{CRGBAcolorf="r"f"g"f"b"f"a"f}}]"currentFont"^{CCharset}"userCharParts"^^{CCharPart}"words"{CDataSet<CName4,CCharArray>="_vptr$CObjectBase"^^?"pMain"^{CMain}"count"i"data"*"dataSize"l}"sets"{CDataSet<CName16,CCharArray>="_vptr$CObjectBase"^^?"pMain"^{CMain}"count"i"data"*"dataSize"l
Can this be avoided, how?
*UPDATE: I just found out that RTTI is on by default. So I cleaned the target, disabled RTTI (GCC_ENABLE_CPP_RTTI = NO) and recompiled. I still see a lot of symbols and source paths in the binary.
UPDATE 2: I checked a few other apps from the app store, and many of them also have their source file paths show up. Pretty scary, if you ask me:
Joined Up Lite
/Users/lloydy/Documents/Development/iPhone/ABC Joined Up/main.m
/Users/lloydy/Documents/Development/iPhone/ABC Joined Up/Classes/SettingsView.m
Crayon Physics
/Users/smproot/Desktop/unzip/CrayonPhysics/v104/Classes/crayon/src/ceng/gameutils/killspriteslowly/killspriteslowly.cpp
/Users/smproot/Desktop/unzip/CrayonPhysics/v104/Classes/crayon/src/ceng/tasks/task/sdl/mixer/ctaskaudiosdlmixer.cpp
Wall Times
/Users/fred/_WORK/ZDNDRP/WallTimes/main.m
/Users/fred/_WORK/ZDNDRP/WallTimes/Classes/SystemCategories.m
Jumbo Calculator
/Users/Christopher/Documents/Development/JumboCalculator 1.0.3/main.m
/Users/Christopher/Documents/Development/JumboCalculator 1.0.3/Classes/CalculatorFaceViewController.m
The file paths are most likely from assert macros which stringify __FILE__ as part of their failure message. iOS's implementation of assert(3) does this, as do the NSAssert macros.
You can remove asserts in release builds by defining NDEBUG (for the C asserts) and NS_BLOCK_ASSERTIONS (for NSAsserts).
In Xcode, set Deployment Postprocessing to Yes so that Xcode calls the strip command during the build process. After that, you no longer see any source paths via nm -a.
However, I still see the source paths of some m files via the strings command :/
What worked for me was setting Generate Debug Symbols to No for release builds. This is under the Apple LLVM 7.0 - Code Generation in Xcode 7.2.
Have you ticked the strip debug symbols option in the build settings? You can do this (or not) depending on the configuration (build/release). You can also look into Objective-C code obfuscation (which is long-winded). From what I gather, you cannot completely remove Objective-C information, because all method calls are done dynamically, so the library has to keep information about your class and method names in order to function. A useful tip here.
If you have C++ code, you can use the gcc strip utility, although I'm not sure how well it handles Objective-C++. If it doesn't, you could compile all your .cpp files into a library, strip that, and link against it in your iOS project.