What is the format of .o Object Files? - c++

I've been learning low-level stuff for a while now and I'm quite familiar with PE files and how they work. But I've been unable to find any documentation of their precursor .o files. I'm looking for a deep-dive on the actual byte-by-byte structure of them.
What does the format of .o files depend on? Is it even platform specific at all? Perhaps compiler-specific? Where can I find documentation on the file format of .o files? Mostly interested in Windows.

.o files as produced by GCC and LLVM when using Windows (MinGW) are COFF files (see https://wiki.osdev.org/COFF). On certain other platforms the ELF format is used.
The Windows PE format (used for .dll and .exe files) is actually an extension of the COFF format: a PE image places an MS-DOS stub and a PE signature in front of a COFF file header.
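To give a feel for the byte-level layout, here is a minimal Python sketch that decodes the 20-byte COFF file header found at the very start of a MinGW or MSVC object file. The field order follows the file header table in the Microsoft PE/COFF specification; the names `parse_coff_header` and `describe` and the small `MACHINES` table are my own, chosen for illustration only.

```python
import struct
from collections import namedtuple

# Field order as in the PE/COFF specification's "COFF File Header" table.
CoffHeader = namedtuple(
    "CoffHeader",
    "machine number_of_sections time_date_stamp pointer_to_symbol_table "
    "number_of_symbols size_of_optional_header characteristics",
)

# Little-endian: two 16-bit fields, three 32-bit fields, two 16-bit fields.
COFF_HEADER = struct.Struct("<HHIIIHH")

# A few well-known machine values; the specification lists many more.
MACHINES = {0x014C: "x86", 0x8664: "x64", 0xAA64: "ARM64"}


def parse_coff_header(data):
    """Decode the COFF file header at the start of `data` (bytes)."""
    if len(data) < COFF_HEADER.size:
        raise ValueError("too short to hold a COFF file header")
    return CoffHeader._make(COFF_HEADER.unpack_from(data, 0))


def describe(header):
    """One-line human-readable summary of a parsed header."""
    machine = MACHINES.get(header.machine, hex(header.machine))
    return (f"{machine}, {header.number_of_sections} sections, "
            f"{header.number_of_symbols} symbols")
```

Something like `describe(parse_coff_header(open("foo.o", "rb").read()))` would then summarise a real object file (`foo.o` is a placeholder name). The section table follows the header directly, because object files normally carry no optional header.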

Related

Can you get information out of C++ Lib Files like How you can Get Info out of Jar files?

Are C++ Lib Files binary, or just some sort of container, like a zip file, which contains all of the binary files?
I ask, because I'm curious if I can open a library file (.lib) to get more information about what files are inside of it, similar to how you can open a jar file and look through it in a human readable way.
I ask, because I'm adding some libraries to my lib path and would prefer to know if the lib files contain the classes I'm trying to reference.
As far as I know, a library file is pure binary. So it's impossible to actually 'see' its contents like a zip file.
If you got hold of some .lib files then it's probable they also came with documentation that explains their functionality. That would be a good place to check if your classes are present in the library.
EDIT: This question describes a lib file inspector called dumpbin, might be what you need.
A lib file contains the compiled binary of all of the compilation units that are provided by the library. Since you have tagged C++Builder, I assume you have OMF-libraries. You can easily get quite a lot of information out of these, for example all function signatures in the library.
C++Builder ships with a tool called TDump that prints contents of the library in human-readable form. It's located in the bin-directory under the C++Builder installation directory.
The example below shows you how to use TDump to dump contents of a library from the command line:
"C:\Program Files\Embarcadero\RAD Studio\10.0\bin\tdump.exe" library.lib > library-dump.txt
You can find each object module in the library by searching the output for "THEADR". After the THEADR-line you will have a list of all of the dependency files (basically includes) used when the object was compiled. After the dependencies there are the symbols, including demangled function signatures.
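If you are curious what TDump is reading, the sketch below walks the records of a single OMF object module and pulls out the THEADR name. It assumes the usual OMF record layout (1-byte type, 2-byte little-endian length, payload, trailing checksum byte) and that THEADR is record type 0x80. The function names are made up for this example. It does not handle a whole .lib, where modules are aligned to page boundaries and followed by a dictionary.

```python
import struct

THEADR = 0x80  # OMF "translator header" record, carries the module name


def iter_omf_records(data):
    """Yield (record_type, payload) for each record of one OMF object module.

    The length field counts the payload plus the final checksum byte;
    the checksum is dropped from the payload returned here.
    """
    pos = 0
    while pos + 3 <= len(data):
        rec_type, length = struct.unpack_from("<BH", data, pos)
        start = pos + 3
        end = start + length
        if length == 0 or end > len(data):
            break  # truncated or not OMF: stop rather than guess
        yield rec_type, data[start:end - 1]
        pos = end


def module_names(data):
    """Return the names stored in all THEADR records of `data`."""
    names = []
    for rec_type, payload in iter_omf_records(data):
        if rec_type == THEADR and payload:
            count = payload[0]  # length-prefixed string
            names.append(payload[1:1 + count].decode("ascii", "replace"))
    return names
```

For anything beyond a quick look, TDump remains the better choice, since it also decodes the symbol and dependency records.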

query on mark pieter article on pe_coff spec

The first two pages give layout diagrams of the PE executable and COFF formats.
So, my question is:
I assume Windows object files and executables are both in COFF format, so what is the PE executable format?
Microsoft PE and COFF Specification:
This specification describes the structure of executable (image) files and object files under the Windows family of operating systems. These files are referred to as Portable Executable (PE) and Common Object File Format (COFF) files, respectively.
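The relationship can be made concrete with a small Python sketch: in an image file the COFF file header is not at offset 0 but sits behind the MS-DOS stub and the 4-byte "PE\0\0" signature, whose position is stored at offset 0x3C. The function name `find_coff_header_offset` is hypothetical; the offsets are taken from the specification.

```python
import struct


def find_coff_header_offset(image):
    """Return the offset of the COFF file header inside a PE image.

    Returns None when `image` (bytes) does not look like a PE image,
    e.g. when it is a plain COFF object file.
    """
    if len(image) < 0x40 or image[:2] != b"MZ":
        return None  # no MS-DOS header
    # The 32-bit value at 0x3C points at the PE signature.
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        return None
    return pe_offset + 4  # COFF file header starts right after the signature
```

An object file returns None here because its COFF header is the first thing in the file.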

How to add a folder in the header and how does .a works?

I am currently working on Ubuntu Linux with a .hpp file and a .cpp file. From these two I am creating a .a file (like a DLL, so that I can use my application on any computer that has Linux installed). Both the .hpp and the .cpp are in folder 1.
I would like to ask:
If I include in the .cpp a header from a folder like:
#include "/home/tests/folder1/folder2/header.h"
will this work correctly after I create the .a using ar rcs and send my .a to another computer? Does the path to a specific header in a folder influence the .a that is created?
If I had to download, for example, gsoap in order to accomplish my task, then after I've created the .a file that contains a lot of .xml and .cpp/.h files from gsoap as well as my own .h and .cpp files, do I need to create a makefile in order to download gsoap on the computer where I want to use my .a (or DLL on Windows) application?
"Any computer that has Linux installed" isn't going to work. Linux runs on a wide variety of platforms and architectures, unlike Windows, which generally targets only two (and 64-bit Windows can still run 32-bit programs, so 32-bit .dlls keep working with 32-bit applications).
As already mentioned elsewhere, a .a is a static library and is equivalent to a Windows .lib, not a .dll. The Linux equivalent to a Windows .dll is a .so "shared object".
No. #includes are resolved by the pre-processor in a step prior to compilation. The contents of the file are literally inserted into the copy of your source file in memory, then the whole lot is compiled. The string with the folder does not exist in your actual compiled module. That said, writing absolute paths is very bad form. It means you cannot move your development environment/directory anywhere. Use relative paths: they should be relative to the directory of the including file and/or to your defined include path. Read your toolchain's documentation for more information.
If you statically link gsoap, then you don't have to do anything. It's compiled into your project. If you want to dynamically link it, then your .a should not contain any .cpp files from gsoap. The target computer must have gsoap shared libraries installed, and this will be a required dependency that your installer or your user must resolve. Makefiles do not download dependencies. Package managers do.
Actually, a ".a" file is an archive file. Linux chose that format for its library files, so you can compare it to ".lib" (".so" is the rough equivalent of ".dll").
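Because the archive format is so simple, you can list what is inside a .a with a few lines of Python. This is only a sketch under the assumption of the common `ar` layout: an 8-byte `!<arch>\n` magic followed by members that each start with a 60-byte ASCII header. `list_ar_members` is a made-up name, and the names are returned raw, so GNU archives will show a trailing "/" and the special members "/" (symbol index) and "//" (long-name table).

```python
AR_MAGIC = b"!<arch>\n"
HEADER_SIZE = 60  # name 16, mtime 12, uid 6, gid 6, mode 8, size 10, end 2


def list_ar_members(data):
    """Return [(name, size)] for every member of an `ar` archive (bytes)."""
    if not data.startswith(AR_MAGIC):
        raise ValueError("not an ar archive")
    members = []
    pos = len(AR_MAGIC)
    while pos + HEADER_SIZE <= len(data):
        header = data[pos:pos + HEADER_SIZE]
        if header[58:60] != b"`\n":
            raise ValueError("corrupt member header")
        name = header[0:16].decode("ascii", "replace").rstrip()
        size = int(header[48:58].decode("ascii").strip())
        members.append((name, size))
        # Member data is padded so the next header starts on an even offset.
        pos += HEADER_SIZE + size + (size & 1)
    return members
```

In practice `ar t libfoo.a` gives the same listing (`libfoo.a` being a placeholder), but the sketch shows that the archive is just object files glued together with small text headers.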
There are a number of stages of compilation: preprocessing, compiling, assembling and linking.
Preprocessing effectively answers your first question because code in the .h/.hpp file is inserted into the .cpp file, meaning that when your code is compiled, all code that is necessary to compile the .cpp file successfully is in that one file.
Compiling turns your code into assembly instructions for the specific kind of computer that you're targeting. This means that if your code was built to run on a PowerPC computer (Mac), your code would use machine instructions that any PPC computer could use (meaning that Intel, AMD, SPARC, Alpha, etc. computers couldn't use your code). This answers your question about moving a ".a" file to another computer: you can use it as long as the computer's processor AND operating system are compatible (you may have a 64-bit processor, but that doesn't mean 32-bit Windows will let you use it to its full capacity).
Assembling converts the primitive text-based assembly instructions into machine instructions that the processor can understand. This creates an object file (.obj on Windows, .o on Linux). This file is what goes in the library (.lib on Windows, .a on Linux). There are other names for machine instructions such as "machine code" and "object code", and any one of them can be used to describe the same thing.
Linking is the last stage. It takes the necessary code from libraries and the various necessary object files and turns them into an actual binary (.exe file on Windows, Linux doesn't need an extension because of how it is designed). This is your application.
Because linking is the last stage, the gsoap library (for example) must be specified in addition to your library or else the linker will say it couldn't find certain "symbols". However, as with your library, the gsoap library must be on that computer to be able to use it. Installing it with the package manager is preferred when possible, but if you can't do that, you need to compile it on that computer. If you're moving from a PPC computer to an Intel/AMD computer, you would also need to re-compile your library as well as gsoap (if you couldn't install gsoap via package manager).
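The "couldn't find certain symbols" complaint is easy to model. The toy Python sketch below is not how a real linker is implemented; it only shows the bookkeeping: every object defines some symbols and references others, and anything referenced but never defined is reported. The function name and the example symbol names are invented for illustration.

```python
def unresolved_symbols(objects):
    """Return the sorted list of symbols that are referenced but not defined.

    `objects` maps an object/library name to a pair
    (set of defined symbols, set of referenced symbols).
    """
    defined = set()
    referenced = set()
    for defs, refs in objects.values():
        defined |= defs
        referenced |= refs
    return sorted(referenced - defined)


# Hypothetical link of an application against "libmine.a" only:
without_gsoap = {
    "main.o": ({"main"}, {"mine_call"}),
    "libmine.a": ({"mine_call"}, {"soap_fn"}),
}
# The same link with a gsoap library added on the command line:
with_gsoap = dict(without_gsoap, **{"libgsoap.a": ({"soap_fn"}, set())})
```

With these inputs, `unresolved_symbols(without_gsoap)` reports `soap_fn`, while `unresolved_symbols(with_gsoap)` comes back empty, which is the situation described above: the library has to be named at link time even though your own .a was built earlier.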
"does the path to a specific header from a folder influence the .a created?" - Only, perhaps, the debug information. Nothing that would prevent it from working if you copy it to another place.
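A toy model of the preprocessor makes this visible. The Python sketch below handles only quoted `#include` lines and takes file contents from a dictionary instead of the disk. It is an illustration with invented names, not a real preprocessor (no include guards, macros or angle-bracket includes, so a cyclic include would recurse forever).

```python
import re

# Matches lines such as:  #include "some/path.h"
INCLUDE = re.compile(r'^\s*#\s*include\s+"([^"]+)"\s*$')


def expand_includes(source, files):
    """Replace each quoted #include line by the text of the named file.

    `files` maps a path string to that file's contents.
    """
    out = []
    for line in source.splitlines():
        match = INCLUDE.match(line)
        if match:
            # Nested includes are expanded recursively.
            out.append(expand_includes(files[match.group(1)], files))
        else:
            out.append(line)
    return "\n".join(out)
```

Feeding it a source line with the absolute path from the question yields text in which only the header's contents remain. The path string itself is gone before the compiler proper sees the code, which is why it cannot affect the resulting .a beyond debug information.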
*.a is a static library. It is like *.lib in windows - not like *.dll
You can move any static or dynamic library (*.a/*.so on Linux, *.lib/*.dll on Windows) to any folder/computer/planet you like and use it there as long as its dependencies are satisfied (all the dynamic and static libraries, software and hardware that your library depends on are available). Of course, running code that uses your library requires the CPU architecture you compiled for and all the dynamic libraries your code uses directly or indirectly.
Not directly related to the question asked: don't #include files by absolute paths. Ever. Define and use include directories. It is a matter of style and readability. Includes like "/home/user/working_dir/blabla.h" or "D:/working_dir/blabla.h" or "..\..\some\directory\blabla.h" are ugly and unmaintainable. Includes like <blabla.h> or <blabla/defs.h> are perfect for library APIs, and includes like "blabla.h" or "subdir/blabla.h" are OK for internal headers.

What is the content of OBJ file?

I know that compiling C/C++ source code with any standard compiler produces an OBJ file, which is later linked with the rest of the required libraries to form the executable file. I want to know the format/structure of the OBJ file.
C++ Builder (and Delphi) use OMF format obj files. See this wikipedia link for details.
Additional information: Microsoft Visual C++ uses the incompatible COFF format, which is why C++ Builder has a utility to convert such files.
See also: What's the difference between the OMF and COFF format?
The .obj file is a format used by Microsoft compilers and is described in the Common Object File Format (COFF) spec.
Other compilers use different formats to store object code, e.g. ELF on Linux.
Under Windows, it'd be a COFF object. Google this file format for a spec. COFF objects are linked to produce a PE.
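If you are not sure which of these formats a given file uses, the first few bytes usually tell. The Python sketch below is a heuristic only: ELF and `ar` archives have real magic numbers, an OMF object normally begins with a THEADR record (byte 0x80), and COFF has no magic at all, so the code merely checks whether the first 16-bit value is one of two common machine types. The function name is hypothetical.

```python
import struct

# COFF machine types checked here: x86 and x64 only (the spec lists more).
KNOWN_COFF_MACHINES = (0x014C, 0x8664)


def guess_object_format(data):
    """Guess the container format of `data` (bytes) from its leading bytes."""
    if data[:4] == b"\x7fELF":
        return "ELF"
    if data[:8] == b"!<arch>\n":
        return "ar archive"
    if data[:2] == b"MZ":
        return "PE image"
    if data[:1] == b"\x80":
        return "OMF"
    if len(data) >= 2:
        (machine,) = struct.unpack_from("<H", data, 0)
        if machine in KNOWN_COFF_MACHINES:
            return "COFF"
    return "unknown"
```

A result of "unknown" only means that this short list of checks did not match, not that the file is invalid.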

Converting COFF lib file to OMF format

Is there any way to convert a COFF library (lib file) to an OMF library for use with C++Builder 6? This COFF library is not just an import library; it contains some code.
When I try to convert it using Borland's coff2omf.exe, I get a 1KB file from a 15KB file.
Instead of the DigitalMars converter, you may use the Object file converter -- objconv -- available at agner.org/optimize
This utility can be used for converting object files between COFF/PE,
OMF, ELF and Mach-O formats for all 32-bit and 64-bit x86 platforms.
Can modify symbol names in object files. Can build, modify and convert
function libraries across platforms. Can dump object files and
executable files. Also includes a very good disassembler supporting
the SSE4, AVX, AVX2, AVX512, FMA3, FMA4, XOP and Knights Corner
instruction sets. Source code included (GPL).
This is a great site for low-level optimization, and there is a lot of useful information in the associated manual PDF file about the library formats across several platforms.
It's fairly typical for an OMF object file to be a lot smaller than an equivalent COFF object, so what you're getting may well be valid.
If you find that it's really not, you can probably break the lib file into individual object files, disassemble the object files, re-assemble them to OMF object files, and put those together into an OMF lib file.
This is rather late, but if anyone is looking for an answer, you can check out COFFIMPLIB from DigitalMars. COFF2OMF is available at the same site, but it looks like that's older.
It may be worth noting that in newer versions of Delphi (>= XE2), the compiler accepts COFF as well as OMF. It's probably also true for C++ Builder. The 64 bit compilers use only COFF.
See here for more information about linking COFF.
Integrating Delphi (OMF) and Ada (GCC, COFF) required a lot of effort, until I gave up on doing it in a single exe.
I honestly tried to split the GCC RTL and Ada RTL .a files (COFF libraries) into many .o objects and convert them via coff2omf (there was DMD's coff2omf and, IIRC, another tool called convobj or similar). Some of the COFF .o files failed to convert to .obj, so I can't say whether it was a reliable approach at all.
Assembler-level conversion is not so simple when it comes to exceptions and other deep details.
It's a pity I haven't tried a tool named UniLink:
ftp://ftp.styx.cabel.net/pub/UniLink/
It's not obvious, but UniLink can probably be used to achieve the goal. One of its targets is C++ Builder package (both dynamic and static). unilink -Tpp -GI should do the trick