How to store and retrieve data from linux binary file - c++

I'm using arm-none-linux-gnueabi-g++ in order to compile a c++ code that will run on an embedded Linux device.
I'm using the arm-none-linux-gnueabi-g++ under windows and get as output the binary file that will run on the Linux machine.
In order to set the embedded device with a new binary, I need to create an archive file (zip) with the binary file and with some more settings files.
So far, so good.
I need to automate this so that the archive file is created automatically and named after the version of the binary file.
Currently, we keep the version as just a simple constant std::string variable in the code. We use that string when printing diagnostic, logging, etc.
How can I read the version from the binary file?
Or are there other methods to achieve that goal?
I thought of storing it at some fixed place in the binary file and reading it from there, but I really don't know how to do that without corrupting the binary.

You are creating the file automatically, so I assume you are first compiling it and then making an archive with the resulting binary.
You could store the version in a text file, and #include that file in your code:
const std::string version =
#include "version.txt"
;
In the version.txt:
"version string"
And when making the archive, you can easily parse the version from the text file.
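As a rough sketch of that last parsing step, assuming version.txt contains a single quoted string as shown above: the helper names and the "firmware-<version>.zip" naming scheme are made up for illustration, not something the question specifies.

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Extracts the text between the first and the last double quote of `content`.
// Returns false when no quoted string is found.
bool parseQuotedVersion(const std::string& content, std::string& version) {
    const std::string::size_type first = content.find('"');
    const std::string::size_type last = content.rfind('"');
    if (first == std::string::npos || last == first) {
        return false;
    }
    version = content.substr(first + 1, last - first - 1);
    return true;
}

// Reads the whole file and parses it (hypothetical helper for a packaging step).
bool readVersionFile(const std::string& path, std::string& version) {
    std::ifstream in(path);
    if (!in) {
        return false;
    }
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return parseQuotedVersion(buffer.str(), version);
}

// Builds an archive name from the version; the naming scheme is an assumption.
std::string archiveName(const std::string& version) {
    return "firmware-" + version + ".zip";
}
```

A packaging script could call readVersionFile("version.txt", v) and pass archiveName(v) to whatever zip tool is in use, so the code and the archive name share one source for the version.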

Ville is correct.
You're currently doing it backwards!
Your build system should provide the version to the executable, not the other way around. Once this is fixed, your build system can provide the same version to other elements, such as your ZIP filename.
Ideally the version would be generated from version control autonomously, but you could specify it in the build command if really necessary.
It's possible to pull some string from the binary (think nm, if there's a Windows equivalent), but that's really the reverse way to do it.

Change Linux shared library (.so file) version after it was compiled

I'm compiling Linux libraries (for Android, using NDK's g++, but I bet my question makes sense for any Linux system). When delivering those libraries to partners, I need to mark them with a version number. I must also be able to access the version number programmatically (to show it in an "About" dialog or a GetVersion function for instance).
I first compile the libraries with an unversioned flag (version 0.0) and need to change this version to a real one when I'm done testing, just before sending it to the partner. I know it would be easier to modify the source and recompile, but we don't want to do that: we would have to test everything again after recompiling, we feel that not recompiling is less error prone (see comments to this post), and our development environment already works this way. For Windows binaries we set a 0.0 resource version string (.rc) and later change it using verpatch, and we'd like to use the same kind of process when shipping Linux binaries.
What would be the best strategy here?
To summarize, requirements are:
1. Compile binaries with an "unset" version (0.0 or anything else)
2. Be able to modify this "unset" version to a specific one without having to recompile the binary (ideally, run a 3rd party tool command, as we do with verpatch under Windows)
3. Be able to have the library code retrieve its version information at runtime
If your answer is "rename the .so", then please provide a solution for 3.: how to retrieve version name (i.e.: file name) at runtime.
I was thinking of some solutions but have no idea if they could work and how to achieve them.
Have a version variable (one string or 3 int) in the code and have a way to change it in the binary file later? Using a binary sed...?
Have a version variable within a resource and have a way to change it in the binary file later? (as we do for win32/win64)
Use a field of the .so (like SONAME) dedicated to this and have a tool allowing to change it...and make it accessible from C++ code.
Rename the lib + change SONAME (did not find how this can be achieved)...and find a way to retrieve it from C++ code.
...
Note that we use QtCreator to compile the Android .so files, but they may not rely on Qt. So using Qt resources is not an ideal solution.
I am afraid you started solving your problem from the end. First of all, SONAME is provided at link time as a linker parameter, so you first need a way to get the version from the source and pass it to the linker. One possible solution is to use the ident utility and put a version string in your binary, for example:
const char version[] = "$Revision: 1.2 $";
This string should appear in the binary, and the ident utility will detect it. Alternatively, you can parse the source file directly with grep or something similar. If conflicts are possible, add an additional marker that you can use later to detect this string, for example:
const char version[] = "VERSION_1.2_VERSION";
So you detect the version number either from the source file or from the .o file and just pass it to the linker. This should work.
As for having debug builds use version 0.0, that is easy: just skip detection when you build debug and use 0.0 as the version unconditionally.
For a 3rd party build system I would recommend cmake, but this is just my personal preference. The solution can easily be implemented in a standard Makefile as well. I am not sure about qmake though.
Discussion with Slava made me realize that any const char* was actually visible in the binary file and could then be easily patched to anything else.
So here is a nice way to fix my own problem:
1. Create a library with:
   - a definition of const char version[] = "VERSIONSTRING:00000.00000.00000.00000"; (it needs to be long enough, because we can later safely modify the binary file content but not extend it)
   - a GetVersion function that cleans up the version variable above (removes VERSIONSTRING: and the useless zeros). It would return:
     0.0 if version is VERSIONSTRING:00000.00000.00000.00000
     2.3 if version is VERSIONSTRING:00002.00003.00000.00000
     2.3.40 if version is VERSIONSTRING:00002.00003.00040.00000
     ...
2. Compile the library, let's name it mylib.so
3. Load it from a program and ask for its version (call GetVersion): it returns 0.0, no surprise
4. Create a little program (I did it in C++, but it could be done in Python or any other language) that will:
   - load the whole binary file content into memory (using std::fstream with std::ios_base::binary)
   - find VERSIONSTRING:00000.00000.00000.00000 in it
   - confirm it appears only once (to be sure we don't modify something we did not mean to; that's why I prefix the string with VERSIONSTRING, to make it more unique)
   - patch it to VERSIONSTRING:00002.00003.00040.00000 if the expected version number is 2.3.40
   - save the binary file back from the patched content
5. Patch mylib.so using the above tool (requesting version 2.3 for instance)
6. Run the same program as in step 3: it now reports 2.3!
No recompilation nor linking, you patched the binary version!
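For illustration, here is a minimal sketch of the two pieces described above: the version clean-up that GetVersion would do and the patching tool. The function names formatVersion and patchBinary are invented for this sketch, and it assumes the compiler really leaves the placeholder in the binary as one contiguous string and reads it at runtime rather than folding it away at compile time.

```cpp
#include <fstream>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Prefix of the placeholder compiled into the library.
const std::string kPrefix = "VERSIONSTRING:";

// Turns "VERSIONSTRING:00002.00003.00040.00000" into "2.3.40".
// Trailing zero fields are dropped, but "major.minor" is always kept.
// Returns an empty string if the prefix is missing.
std::string formatVersion(const std::string& raw) {
    if (raw.compare(0, kPrefix.size(), kPrefix) != 0) return "";
    std::istringstream in(raw.substr(kPrefix.size()));
    std::vector<unsigned long> fields;
    std::string field;
    while (std::getline(in, field, '.')) fields.push_back(std::stoul(field));
    while (fields.size() > 2 && fields.back() == 0) fields.pop_back();
    std::string out;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        if (i != 0) out += '.';
        out += std::to_string(fields[i]);
    }
    return out;
}

// Replaces the single occurrence of `from` in the file with `to`.
// Fails if the lengths differ (the binary must not grow or shrink),
// if the file cannot be read or written, or if `from` does not
// appear exactly once.
bool patchBinary(const std::string& path, const std::string& from,
                 const std::string& to) {
    if (from.empty() || from.size() != to.size()) return false;
    std::ifstream in(path, std::ios::binary);
    if (!in) return false;
    std::string data((std::istreambuf_iterator<char>(in)),
                     std::istreambuf_iterator<char>());
    in.close();
    const std::string::size_type pos = data.find(from);
    if (pos == std::string::npos) return false;                       // not found
    if (data.find(from, pos + 1) != std::string::npos) return false;  // ambiguous
    data.replace(pos, from.size(), to);
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(data.data(), static_cast<std::streamsize>(data.size()));
    return static_cast<bool>(out);
}
```

Refusing to patch when the placeholder is missing or appears more than once is what keeps the tool from silently corrupting an unrelated part of the file.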

why I am getting "invalid command name "MZ"" on loading a dll on wish console?

I have a library and I have generated Tcl bindings for it using SWIG. The generated DLL is xyz_tcl.dll, where my original library DLL is xyz.dll. But when I try to load the DLL, it says "invalid command name "MZ"". Can anyone tell me what the reason could be?
The MZ is almost certainly the first few bytes of the DLL (it's the “magic number” of the file format) so at a guess you're trying to do:
source xyz_tcl.dll
That won't work. It contains compiled C code that integrates with Tcl, but not a Tcl script. Instead, you need to do:
load xyz_tcl.dll
Of course, it should be built into a package (which is a directory containing the required DLLs and a file pkgIndex.tcl) which would then let you do something like this instead:
package require xyz
(The pkgIndex.tcl file contains instructions on how to define the package using the other files, through load and source as necessary.)
I think that something (tcl?) is trying to execute the DLL as a script - the first two bytes of a Windows executable file are 'M' and 'Z'.
For historical reasons, every Win32 executable has a small 16-bit MS-DOS header just before the actual Win32 PE header, and the signature bytes for the 16-bit header are "MZ".
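To see that signature for yourself, a small check like the following reads the first two bytes of a file and compares them with 'M' and 'Z'. It is only an illustrative sketch (the function name is made up) and is not part of Tcl or SWIG.

```cpp
#include <fstream>
#include <string>

// Returns true when the file starts with the two bytes 'M' 'Z',
// the signature of the MS-DOS header found at the start of Windows
// executables and DLLs. Returns false if the file cannot be read.
bool hasMzSignature(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    char magic[2] = {0, 0};
    if (!in.read(magic, 2)) {
        return false;
    }
    return magic[0] == 'M' && magic[1] == 'Z';
}
```

A file that passes this check is binary code, so it has to be handed to load rather than source.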

How could I query binary's source code version

My environment is Linux CentOS 6.2, and I have a source control system like svn/hg/git etc. My source code is C/C++.
I want to check in the built binary to keep track of which binary was released to the customer.
I assume the built binary's checksum will differ when the source code changes.
So I could trace back which binary was built from which version.
Is this possible, and what tricks must I follow?
I've seen some executables display the revision when executed with the -version option.
But I wonder how to prevent a wrong -version string from being written into the executable.
Suppose I keep an md5.txt and check that in instead of the binary.
How can I make sure I can build an executable with the same md5 again?
Sorry; to clarify my question and prevent further unexpected answers, I would prefer an answer like:
Keep an md5sum.txt in SCM when releasing a new version to users.
Keep binaries separate from your SCM.
To rebuild a binary with the same md5sum, you should make sure to:
write a symbol into the binary at build time (e.g. with -DVERSION="1.x")
show the VERSION string to the user
remove all $Id keywords, which make your SCM run slower
keep the same CPU, OS, compiler and library environment
...
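On the source side, the -DVERSION item in the list above might look roughly like this. The fallback value and the function name are assumptions for the sketch; the quoting in the comment is what a POSIX shell would need so that the macro expands to a string literal.

```cpp
#include <string>

// VERSION is expected from the build command, for example:
//   g++ -DVERSION='"1.x"' main.cpp
// When it is not supplied, fall back to a placeholder (an assumption
// of this sketch) so unversioned builds are easy to spot.
#ifndef VERSION
#define VERSION "0.0-unversioned"
#endif

// Returns the version string that was baked in at compile time.
std::string versionString() {
    return VERSION;
}
```

The program can then print versionString() when run with its version option, and the build system stays the single place where the value is decided.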
Create strings within a .cpp file like this:
static const char version[] = "@(#) $Id$";
where $Id$ is expanded by SVN.
Use the what command (see the manual page). It searches for the @(#) marker and extracts these strings from the binary so you can check.
Is this an executable or a shared library? If the latter, you could export a function that would return the version (number, string, your choice). Then dlopen(), dlsym(), and execute the function.
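A minimal sketch of that dlopen()/dlsym() approach, assuming the library exports a function with C linkage and the signature const char* f(); the wrapper name queryLibraryVersion is made up here. On older glibc versions this needs to be linked with -ldl.

```cpp
#include <dlfcn.h>
#include <string>

// Loads `libraryPath`, looks up `symbol` and calls it as `const char* f()`.
// Returns an empty string when the library or the symbol cannot be found.
// Assumes the exported function was declared extern "C" in the library.
std::string queryLibraryVersion(const std::string& libraryPath,
                                const std::string& symbol) {
    void* handle = dlopen(libraryPath.c_str(), RTLD_NOW);
    if (handle == nullptr) {
        return "";
    }
    typedef const char* (*VersionFn)();
    VersionFn fn = reinterpret_cast<VersionFn>(dlsym(handle, symbol.c_str()));
    std::string result;
    if (fn != nullptr) {
        const char* value = fn();
        if (value != nullptr) {
            result = value;  // copy before the library is unloaded
        }
    }
    dlclose(handle);
    return result;
}
```

For example, queryLibraryVersion("./mylib.so", "GetVersion") would return whatever that hypothetical library reports, or an empty string if loading fails.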
For executable ELF binaries, you might be able to implant some data in the binary that can be queried using the 'nm' utility.
If you use Subversion, SvnRev will do most of the work for you (no md5 in the repo; the repo holds the sources, and the binary carries a resource with the revision id).
For Mercurial, you can get an idea for the version string from the VersioningWithMake wiki. To get a string similar to the result of git describe, instead of the simple template {node|short} for HGVERSION you can use something like {latesttag}+{latesttagdistance}:{node|short}, which gives (for example) 1.3+11:8a226f0f99aa

How can I load an external file/program in memory and then execute it (C++ and Unix)?

Let's say I have an executable file called "execfile". I want to read that file using a C++ program (on Unix) and then execute it from memory. I know this is possible on Windows but I was not able to find a solution for Unix.
In pseudo-code it would be something like this:
declare buffer (char *)
readfile "execfile" in buffer
execute buffer
Just to make it clear: obviously I could just execute the file using system("execfile"), but, as I said, this is not what I intend to do.
Thank you.
EDIT 1: To make it even more clear (and the reason why I can't use dlopen): the reason I need this functionality is because the executable files are going to be generated dynamically and so I cannot just build all of them at once in a single library. To be more precise I'm building a tool that will first encrypt an executable file with a key and then it will be able to execute that encrypted file, first decrypting it and then executing it (and I don't want to have a copy of the decrypted file on the file system).
You cannot without writing a mountain of code. Loading and linking an a.out is a kernel facility, not a user mode facility, on linux.
You'd be better off making a shared library and loading it with dlopen.
The solution to load-and-run, not necessarily in C++, is to use dlopen+dlsym to load a dynamic library and obtain a pointer to a function defined in the library.
See C++ dlopen mini HOWTO for description of solving problems with C++ symbols in dynamic libraries.

Compiling libmagic statically (c/c++ file type detection)

Thanks to the guys that helped me with my previous question (linked just for reference).
I can place the files fileTypeTest.cpp, libmagic.a, and magic in a directory, and I can compile with g++ fileTypeTest.cpp -o fileTypeTest -lmagic. Later, I'll be testing to see if it runs in Windows compiled with MinGW.
I'm planning on using libmagic in a small GUI application, and I'd like to compile it statically for distribution. My problem is that libmagic seems to require the external file, magic. (I'm actually using my own shortened and compiled version, magic_short.mgc, but I digress.)
A hacky solution would be to code the file into the application, creating (and deleting) the external file as needed. How can I avoid this?
added for clarity:
magic is a text file that describes properties of different filetypes. When asked to identify a file, libmagic searches through magic. There is a compiled version, magic.mgc that works faster. My application only needs to identify a handful of filetypes before deciding what to do with them, so I'll be using my own magic_short file to create magic_short.mgc.
This is tricky, but I suppose you could do it this way. (By the way, I have downloaded the libmagic source and am looking at it.)
There's a function in there called magic_read_entries within minifile.c (this is the pure vanilla source that I downloaded from sourceforge), which is where it reads from the external file.
You could append the magic file (which is found in the /etc directory) to the end of the library, like this: cat magic >> libmagic.a. On my system, magic is 474443 bytes and libmagic.a is 38588 bytes.
In the magic.c file, you would need to change the magichandle_t* magic_init(unsigned flags) function: at the end of the function, add a call to magic_read_entries, and modify that function to read from the offset within the library itself to pull in the data, treat it as a pointer to pointer to char (char **), and use that instead of reading from the file. Since you know the offset of the data within the library, that should not be difficult.
Now the function magic_read_entries will no longer be used, as the data is not going to be read from a separate file anymore. The function magichandle_t* magic_init(unsigned flags) will take care of loading the entries and you should be OK there.
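The general idea of reading data that was appended to another file at a known offset can be sketched on its own like this. This is not libmagic code; the function name is invented for illustration, and the offset is assumed to be the size of the original file before the cat.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Reads everything from byte `offset` to the end of the file.
// Returns an empty string if the file cannot be opened or the seek fails.
std::string readAppendedData(const std::string& path, std::streamoff offset) {
    std::ifstream in(path, std::ios::binary);
    if (!in) {
        return "";
    }
    in.seekg(offset, std::ios::beg);
    if (!in || in.tellg() != std::streampos(offset)) {
        return "";
    }
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

If the original file ever changes size, the offset has to be updated to match, which is the main weakness of this trick.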
If you need further help, let me know,
Edit:
I have used the old 'libmagic' from sourceforge.net and here is what I did:
Extracted the downloaded archive into my home directory, ungzipping/untarring the archive will create a folder called libmagic.
Create a folder within libmagic and call it Test
Copy the original magic.c and minifile.c into Test
Using the enclosed diff output highlighting the difference, apply it onto the magic.c source.
48a49,51
> #define MAGIC_DATA_OFFSET 0x971C
> #define MAGIC_STAT_LIB_NAME "libmagic.a"
>
125a129,130
> /* magic_read_entries is obsolete... */
> magic_read_entries(mh, MAGIC_STAT_LIB_NAME);
251c256,262
<
---
>
> if (!fseek(fp, MAGIC_DATA_OFFSET, SEEK_SET)){
> if (ftell(fp) != MAGIC_DATA_OFFSET) return 0;
> }else{
> return 0;
> }
>
Then issue make
The magic file (which I copied from /etc, under Slackware Linux 12.2) is concatenated to the libmagic.a file, i.e. cat magic >> libmagic.a. The SHA checksum for magic is (4abf536f2ada050ce945fbba796564342d6c9a61 magic),
here's the exact data for magic
(-rw-r--r-- 1 root root 474443 2007-06-03 00:52 /etc/file/magic) as found on my system.
Here's the diff for the minifile.c source, apply it and rebuild minifile executable by running make again.
40c40
< magic_read_entries(mh,"magic");
---
> /*magic_read_entries(mh,"magic");*/
It should work then. If not, you will need to adjust the offset into the library for reading by modifying the MAGIC_DATA_OFFSET. If you wish, I can stick up the magic data file into pastebin. Let me know.
Hope this helps,
Best regards,
Tom.
I can tell you how to link a library statically: you simply pass the path to the .a file at the end of your g++ command. .a files are just archives of compiled objects (.o). Running "ldd fileTypeTest" will show you the dynamically linked libraries; ${libdir}/libmagic.so shouldn't be among them.
As for linking in an external data file, I don't know. Can you not package the application (.deb|.rpm|.tar.bz2)? On Windows, I'd write an installer using NSIS.
In the past I've built self-extracting archives. Basically it is a .exe file consisting of a .zip archive and code to unzip it. Download the .exe, run it, and poof! You can have as many files as you want.
http://en.wikipedia.org/wiki/Self-extracting_archive