I'm on Linux with g++ 5.3.0.
I thought I'd make myself an object file that, when linked, would initialize the global variables Argc and Argv so that the main arguments would be available throughout the process.
Argv.hh:
#pragma once
extern char** Argv;
extern int Argc;
Argv.cc:
char** Argv;
int Argc;
static int set_argv(int argc, char** argv, char** env) { Argc = argc; Argv = argv; return 0; }
/* Put the function into the init_array */
__attribute__((section(".init_array"))) static void *ctr = (void*)&set_argv;
main.cc:
#include "Argv.hh"
#include <stdio.h>
int main(){
for (int i = 0; i < Argc; ++i){
puts(Argv[i]);
}
return 0;
}
My original build script was:
com='g++ -std=c++1y'
for cc in *.cc; do $com -c $cc; done
g++ *.o
but it kept giving me a linking error. So I changed com to
gcc -x c -std=c99 and it worked; it also worked with plain com=g++.
All three compiler configurations compile successfully; only the link step fails, and only with g++ -std=c++1y.
nm *.o outputs:
For gcc -x c -std=c99:
Argv.o:
0000000000000004 C Argc
0000000000000008 C Argv
0000000000000000 t ctr
0000000000000000 t set_argv
main.o:
U Argc
U Argv
0000000000000000 T main
U puts
For g++:
Argv.o:
0000000000000008 B Argc
0000000000000000 B Argv
0000000000000000 t _ZL3ctr
0000000000000000 t _ZL8set_argviPPcS0_
main.o:
U Argc
U Argv
0000000000000000 T main
U puts
And for g++ -std=c++1y:
Argv.o:
0000000000000008 B Argc
0000000000000000 B Argv
0000000000000000 t _ZL3ctr
0000000000000000 t _ZL8set_argviPPcS0_
main.o:
0000000000000008 B Argc
0000000000000000 B Argv
0000000000000000 T main
U puts
The last set of object files fails to link with
main.o:(.bss+0x0): multiple definition of `Argv'
Argv.o:(.bss+0x0): first defined here
main.o:(.bss+0x8): multiple definition of `Argc'
Argv.o:(.bss+0x8): first defined here
collect2: error: ld returned 1 exit status
Why does g++ -std=c++1y generate B symbols for extern declarations when the other two generate (as they should?) undefined references? Is this a bug?
I read here that:
A function with internal linkage is only visible inside one translation unit. When the compiler compiles a function with internal linkage, the compiler writes the machine code for that function at some address and puts that address in all calls to that function (which are all in that one translation unit), but strips out all mention of that function in the ".o" file.
I compiled this code:
int g_i{}; //extern
static int sg_i{}; //static
static int add(int a, int b) //internal linkage!
{
return a+b;
}
int main()
{
static int s_i{}; //static - local
int a_i{}; //auto - local
a_i = add(1,2);
return 0;
}
I then compiled it with g++ -c to create main.o.
Running nm -C main.o gives this result:
0000000000000000 b .bss
0000000000000000 d .data
0000000000000000 p .pdata
0000000000000000 r .rdata$zzz
0000000000000000 t .text
0000000000000000 r .xdata
U __main
0000000000000000 t add(int, int)
0000000000000004 b sg_i
0000000000000008 b main::s_i
0000000000000000 B g_i
0000000000000014 T main
Can you please explain why those internal identifiers are still mentioned in the object file, when I have read that a linker using these object files will have no idea about their existence?
Thanks.
The linker knows that such a function exists. However, it also knows that a function with internal linkage is visible only inside its own translation unit, so it refuses to resolve references to that function from any other translation unit.
That is why those internal identifiers are still in the object file: they are recorded as local symbols (the lowercase t and b in the nm output), which tells the linker that they belong only to this translation unit.
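As a small illustration of the point above (a sketch, not taken from the original post; the file name and the names counter and call_add are made up for the example), the following translation unit mixes internal and external linkage. Compiled with g++ -c and without optimization, nm would be expected to list add and counter with lowercase letters (local) and g_i and call_add with uppercase letters (global):

```cpp
// linkage_demo.cpp (hypothetical file name)

// Internal linkage: recorded as a local symbol in the object file.
static int add(int a, int b) { return a + b; }

namespace {
int counter = 0;  // an unnamed namespace also gives internal linkage
}

// External linkage: recorded as global symbols that other object files can reference.
int g_i = 0;

int call_add(int a, int b) {
    ++counter;        // counts the calls made through this translation unit
    g_i = add(a, b);  // calling add() is fine inside its own translation unit
    return g_i;
}
```

Another .cpp file could define its own static int add(int, int) without any clash, but if it merely declared int add(int, int); and called it, the link would fail with an undefined reference.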
I know how to use the inline keyword to avoid a 'multiple definition' error when using C++ templates. What I am curious about is how the linker distinguishes a full specialization, which violates the ODR and is reported as an error, from an implicit specialization, which it handles correctly.
From the nm output, we can see duplicated definitions in main.o and other.o for both the int version and the char version of max(), but the linker only reports a 'multiple definition' error for the char version and lets the int version link successfully. How does the linker differentiate them?
// tmplhdr.hpp
#include <iostream>
// this function is instantiated in main.o and other.o
// but leads no 'multiple definition' error by linker
template<typename T>
T max(T a, T b)
{
std::cout << "match generic\n";
return (b<a)?a:b;
}
// 'multiple definition' link error if without inline
template<>
inline char max(char a, char b)
{
std::cout << "match full specialization\n";
return (b<a)?a:b;
}
// main.cpp
#include "tmplhdr.hpp"
extern int mymax(int, int);
int main()
{
std::cout << max(1,2) << std::endl;
std::cout << mymax(10,20) << std::endl;
std::cout << max('a','b') << std::endl;
return 0;
}
// other.cpp
#include "tmplhdr.hpp"
int mymax(int a, int b)
{
return max(a, b);
}
Test output on Ubuntu is reasonable; but output on Cygwin is rather strange and confusing...
==== Test on Cygwin ====
g++ linker only reported 'char max(char, char)' is duplicated.
$ g++ -o main.exe main.cpp other.cpp
/usr/lib/gcc/x86_64-pc-cygwin/11/../../../../x86_64-pc-cygwin/bin/ld:
/tmp/ccYivs3O.o:other.cpp:(.text$_Z3maxIcET_S0_S0_[_Z3maxIcET_S0_S0_]+0x0):
multiple definition of `char max<char>(char, char)';
/tmp/cc7HJqbS.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
I dumped my .o object files but did not find many clues (maybe because I am not familiar with the object format spec).
$ nm main.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <-- implicit specialization
U mymax(int, int)
$ nm other.o | grep max | c++filt.exe
0000000000000000 p .pdata$_Z3maxIcET_S0_S0_
0000000000000000 p .pdata$_Z3maxIiET_S0_S0_
0000000000000000 t .text$_Z3maxIcET_S0_S0_
0000000000000000 t .text$_Z3maxIiET_S0_S0_
0000000000000000 r .xdata$_Z3maxIcET_S0_S0_
0000000000000000 r .xdata$_Z3maxIiET_S0_S0_
000000000000009b t _GLOBAL__sub_I__Z5mymaxii
0000000000000000 T char max<char>(char, char) <-- full specialization
0000000000000000 T int max<int>(int, int) <-- implicit specialization
0000000000000000 T mymax(int, int)
==== Test on Ubuntu ====
This is what I got on Ubuntu with g++-9 after removing inline from tmplhdr.hpp:
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ g++ -o main main.o other.o
/usr/bin/ld: other.o: in function `char max<char>(char, char)':
other.cpp:(.text+0x0): multiple definition of `char max<char>(char, char)'; main.o:main.cpp:(.text+0x0): first defined here
collect2: error: ld returned 1 exit status
The char version of max() is marked with T, which does not allow multiple definitions, but the int version of max() is marked with W, which does. However, I am curious why nm gives different marks on Cygwin than on Ubuntu, and why the linker on Cygwin can handle two T definitions correctly.
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm main.o | grep max | c++filt
0000000000000133 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
U mymax(int, int)
tony@Win10Bedroom:/mnt/c/Users/Tony Su/My Documents/cpphome$ nm other.o | grep max | c++filt
00000000000000d7 t _GLOBAL__sub_I__Z3maxIcET_S0_S0_
0000000000000000 T char max<char>(char, char)
0000000000000000 W int max<int>(int, int)
000000000000003e T mymax(int, int)
However, I am curious why nm gives different marks on Cygwin than on Ubuntu, and why the linker on Cygwin can handle two T definitions correctly.
You need to understand that the nm output does not give you the full picture.
nm is part of binutils, and uses libbfd. The way this works is that various object file formats are parsed into libbfd-internal representation, and then tools like nm print that internal representation in human-readable format.
Some things get "lost in translation". This is the reason you should almost never use e.g. objdump to look at ELF files (at least not at their symbol tables).
As you correctly deduced, the reason multiple max<int>() symbols are allowed on Linux is that the compiler emits them as a W (weakly defined) symbol.
The same is true for Windows, except that Windows uses the older COFF format, which doesn't have weak symbols. Instead, the symbol is emitted into a special .linkonce.$name section, and the linker knows that it may include any one such section in the link, but only once (i.e. it discards the duplicates of that section from all other object files).
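To make the two cases concrete, here is a minimal sketch of a header that is safe to include from several .cpp files (the header name and my_max are made up for this example, the latter to avoid clashing with std::max). The primary template may be instantiated implicitly in every translation unit, while the full specialization is an ordinary function definition and therefore needs inline:

```cpp
// my_max.hpp (hypothetical header)
#ifndef MY_MAX_HPP
#define MY_MAX_HPP

// Primary template: each translation unit that uses my_max<int> emits its own
// copy, marked so that the duplicates can be merged at link time (W on ELF).
template <typename T>
T my_max(T a, T b) {
    return (b < a) ? a : b;
}

// Full specialization: no longer a template, so without `inline` every
// translation unit including this header would emit a strong definition (T).
template <>
inline char my_max<char>(char a, char b) {
    return (b < a) ? a : b;
}

#endif  // MY_MAX_HPP
```

Removing the inline keyword from the specialization should reproduce the 'multiple definition' error as soon as two source files include the header.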
This question arose in the context of this question: Find unexecuted lines of c++ code
When searching for this problem most people tried to add code and variables into the same section - but this is definitely not the problem here. Here is a minimal working example:
unsigned cover() { return 0; }
#define COV() do { static unsigned cov[2] __attribute__((section("cov"))) = { __LINE__, cover() }; } while(0)
inline void foo() {
COV();
}
int main(int argc, char* argv[])
{
COV();
if (argc > 1)
COV();
if (argc > 2)
foo();
return 0;
}
Compiling it with g++ -std=c++11 test.cpp (g++ (GCC) 4.9.2 20150212 (Red Hat 4.9.2-6)) results in the following error:
test.cpp:6:23: error: cov causes a section type conflict with cov
COV();
^
test.cpp:11:30: note: ‘cov’ was declared here
COV();
^
The error is not very helpful though, as it does not state why this is supposed to be a conflict. Both the .ii and .s temporary files give no hint as to what might be the problem. In fact there is only one section definition in the .s file:
.section cov,"aw",@progbits
and I don't see why the next definition should conflict with this ("aw",@progbits is correct...).
Is there any way to get more information on this, to see what the precise conflict is? Or is this just a bug...?
The message is indeed very bad, but it isn't a bug.
The problem occurs with the inline function foo(), because inline functions must be defined in each translation unit where they are used. In the documentation of the section attribute we can read:
"..uninitialized variables tentatively go in the common (or bss) section and can be multiply ‘defined’. Using the section attribute changes what section the variable goes into and may cause the linker to issue an error if an uninitialized variable has multiple definitions...".
Thus, when the foo function needs to be 'defined' in function main, the compiler finds the cov variable previously defined in the inline function foo and issues the error.
Let’s do the preprocessor's work and expand the COV() macro to clarify the problem:
inline void foo()
{
do { static unsigned cov[2] __attribute__((section("cov"))) = { 40, cover() }; } while(0);
}
int main(int argc, char *argv[]) {
do { static unsigned cov[2] __attribute__((section("cov"))) = { 44, cover() }; } while(0);
if (argc > 1)
do { static unsigned cov[2] __attribute__((section("cov"))) = { 47, cover() }; } while(0);
if (argc > 2)
foo();
return 0;
}
To facilitate reasoning, let’s change the section attribute of the definition in the inline function foo to cov.2, just so that the code compiles. Now there is no error, so we can examine the object file (.o) with objdump:
objdump -C -t -j cov ./cmake-build-debug/CMakeFiles/stkovf.dir/main.cpp.o
./cmake-build-debug/CMakeFiles/stkovf.dir/main.cpp.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l d cov 0000000000000000 cov
0000000000000000 l O cov 0000000000000008 main::cov
0000000000000008 l O cov 0000000000000008 main::cov
objdump -C -t -j cov.2 ./cmake-build-debug/CMakeFiles/stkovf.dir/main.cpp.o
./cmake-build-debug/CMakeFiles/stkovf.dir/main.cpp.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l d cov.2 0000000000000000 cov.2
0000000000000000 u O cov.2 0000000000000008 foo()::cov
We can see that the compiler makes foo()::cov, in section cov.2, GLOBAL (marked by the ‘u’ flag, a unique global symbol).
When we use the same section name (cov), the compiler, trying to ‘define’ foo in the main block, encounters a previously defined global cov and issues the error.
If you make the inline foo static (inline static void foo() . . .), which stops the compiler from emitting a standalone copy of the inline function and makes it just copy the body at expansion time, you’ll see the error disappear, because there is no longer a global foo()::cov.
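A sketch of that last variant, assuming GCC (or Clang) on an ELF target; run is a made-up stand-in for main so that the fragment can be dropped into a larger file. With foo declared static, its local cov array is expected to be an ordinary local object like the ones in run, so all of them can share the cov section:

```cpp
unsigned cover() { return 0; }

// Same macro as in the question: one static record per call site, placed in section "cov".
#define COV() do { static unsigned cov[2] __attribute__((section("cov"))) = { __LINE__, cover() }; } while (0)

// `static` gives foo internal linkage, so foo()::cov is no longer a global symbol.
static inline void foo() {
    COV();
}

// Hypothetical stand-in for main(), taking only the argument count.
int run(int argc) {
    COV();
    if (argc > 1)
        COV();
    if (argc > 2)
        foo();
    return 0;
}
```

The trade-off is that every translation unit including such a function gets its own private copy of foo and of its cov record.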
I'm trying to use the Adobe XMP library in an iOS application, but I'm getting link errors. I double-checked that the appropriate headers and libraries are on my path. I also looked for the mangled names of the methods, but they aren't in the library (I checked using the nm command). What am I doing wrong?
Library Header:
#if defined ( TXMP_STRING_TYPE )
#include "TXMPMeta.hpp"
#include "TXMPIterator.hpp"
#include "TXMPUtils.hpp"
typedef class TXMPMeta <TXMP_STRING_TYPE> SXMPMeta; // For client convenience.
typedef class TXMPIterator <TXMP_STRING_TYPE> SXMPIterator;
typedef class TXMPUtils <TXMP_STRING_TYPE> SXMPUtils;
.mm file:
#include <string>
using namespace std;
#define IOS_ENV
#define TXMP_STRING_TYPE string
#import "XMP.hpp"
void DoStuff()
{
SXMPMeta meta;
string returnValue;
meta.SetProperty ( kXMP_NS_PDF, "test", "{ formId: {guid} }" );
meta.DumpObject(DumpToString, &returnValue);
}
Link Errors:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::DumpObject(int (*)(void*, char const*, unsigned int), void*) const", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::TXMPMeta()", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::SetProperty(char const*, char const*, char const*, unsigned int)", referenced from:
(null): "TXMPMeta<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::~TXMPMeta()", referenced from:
(null): Linker command failed with exit code 1 (use -v to see invocation)
Basically, what's happened is that you only have the declarations in the headers.
If I say
template<class T> T something(T); somewhere, that tells the compiler "trust me bro, it exists, leave it to the linker"
and it adds the symbol to the object file as if it did exist. Because it can see the prototype, it knows how much stack space is needed, what type is returned and so on, so it sets things up so that the linker can come along and put in the address of the function.
BUT in your case there is no address. You /MUST/ have the template definition (not just the declaration) in the same file so the compiler can create an instance (with weak linkage). Here the members are assumed to exist, but nowhere is the class actually stamped out from the template, so the linker doesn't find it, hence the error.
Will fluff out my answer now, hope this helps.
Addendum 1:
template<class T> void output(T&);
int main(int,char**) {
int x = 5;
output(x);
return 0;
}
This will compile but NOT link.
Output:
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
g++ build/main.o -o a.out
build/main.o: In function `main':
(my home)/src/main.cpp:13: undefined reference to `void output<int>(int&)'
collect2: error: ld returned 1 exit status
make: *** [a.out] Error 1
(I hijacked an open project for this, hence the names)
As you can see, the compile command works fine (the one that ends in -o build/main.o) because we told it "look, this function exists".
So the object file says to the linker (in some "name-mangled form" to keep the template arguments) "put the location in memory of void output<int>(int&) here", and the linker can't find it.
Compiles and links
#include <iostream>
template<class T> void output(T&);
int main(int,char**) {
int x = 5;
output(x);
return 0;
}
template<class T> void output(T& what) {
std::cout<<what<<"\n";
std::cout.flush();
}
Notice line 2: we tell it "there exists a function, a template in T called output, that returns nothing and takes a T reference". That means it can be used in the main function (remember, when the compiler is parsing main it hasn't seen the definition of output yet, it has just been told that it exists); the definition further down lets the compiler instantiate output<int>, and the linker then fixes up the call. Modern compilers are much, much smarter (because we have more RAM :) ) and restructure your code aggressively, and link-time optimisation does this even more, but this is how it used to work, and how it can be considered to work these days.
Output:
make all
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
g++ build/main.o -o a.out
As you can see it compiled fine and linked fine.
Multiple files without include as proof of this
main.cpp
#include <iostream>
int TrustMeCompilerIExist();
int main(int,char**) {
std::cout<<TrustMeCompilerIExist();
std::cout.flush();
return 0;
}
proof.cpp
int TrustMeCompilerIExist() {
return 5;
}
Compile and link
make all
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/main.cpp >> build/main.o.d ; then rm build/main.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/main.cpp -o build/main.o
if ! g++ -Isrc -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -MM src/proof.cpp >> build/proof.o.d ; then rm build/proof.o.d ; exit 1 ; fi
g++ -Wall -Wextra -O3 -std=c++11 -g -gdwarf-2 -Wno-write-strings -Isrc -c src/proof.cpp -o build/proof.o
g++ build/main.o build/proof.o -o a.out
(Outputs 5)
Remember, #include LITERALLY dumps a file where it says "#include" (plus some markers that adjust line numbers); the result is called a translation unit. Rather than using a header file to contain "int TrustMeCompilerIExist();", which declares that the function exists (but again the compiler doesn't know where it is or what code is inside it, just that it exists), I repeated myself.
Let's look at proof.o
command
objdump proof.o -t
output
proof.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 proof.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .debug_info 0000000000000000 .debug_info
0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev
0000000000000000 l d .debug_aranges 0000000000000000 .debug_aranges
0000000000000000 l d .debug_line 0000000000000000 .debug_line
0000000000000000 l d .debug_str 0000000000000000 .debug_str
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text 0000000000000006 _Z21TrustMeCompilerIExistv
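Right at the bottom there, there's a function (the F flag) at offset 0 in the .text section, 6 bytes long, and the g means it is global. You can see the name starts with _Z (this is why names beginning with an underscore followed by a capital letter are reserved for the implementation); _Z is the prefix of every mangled C++ name, 21 is the length of the name, and the v after the name stands for "void", i.e. an empty parameter list (the return type of an ordinary function is not encoded).
The zeros at the start, by the way, are the symbol's value, here its offset within the section; the column is 16 digits wide because binaries can be HUGE.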
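If you would rather not decode mangled names by hand, GCC and Clang ship a demangler in <cxxabi.h>. A minimal sketch, assuming the Itanium C++ ABI used on Linux (demangle is a helper name made up for this example; abi::__cxa_demangle is the runtime function it wraps):

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Hypothetical helper: returns the demangled name, or the input unchanged on failure.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer, __cxa_demangle allocates the result with malloc.
    char* out = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);  // the caller owns the malloc'ed buffer
    return result;
}
```

Calling demangle("_Z21TrustMeCompilerIExistv") yields "TrustMeCompilerIExist()", which is the same translation that c++filt or objdump -C performs.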
Disassembly
running:
objdump proof.o -S gives
proof.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <_Z21TrustMeCompilerIExistv>:
int TrustMeCompilerIExist() {
return 5;
}
0: b8 05 00 00 00 mov $0x5,%eax
5: c3 retq
Because I have -g, you can see that it interleaves the source code that the assembly relates to (it makes more sense with bigger functions: it shows you what the instructions up to the next code block actually do); that wouldn't normally be there.
main.o
Here's the symbol table, obtained the same way as the above:
objdump main.o -t
main.o: file format elf64-x86-64
SYMBOL TABLE:
0000000000000000 l df *ABS* 0000000000000000 main.cpp
0000000000000000 l d .text 0000000000000000 .text
0000000000000000 l d .data 0000000000000000 .data
0000000000000000 l d .bss 0000000000000000 .bss
0000000000000000 l d .text.startup 0000000000000000 .text.startup
0000000000000030 l F .text.startup 0000000000000026 _GLOBAL__sub_I_main
0000000000000000 l O .bss 0000000000000001 _ZStL8__ioinit
0000000000000000 l d .init_array 0000000000000000 .init_array
0000000000000000 l d .debug_info 0000000000000000 .debug_info
0000000000000000 l d .debug_abbrev 0000000000000000 .debug_abbrev
0000000000000000 l d .debug_loc 0000000000000000 .debug_loc
0000000000000000 l d .debug_aranges 0000000000000000 .debug_aranges
0000000000000000 l d .debug_ranges 0000000000000000 .debug_ranges
0000000000000000 l d .debug_line 0000000000000000 .debug_line
0000000000000000 l d .debug_str 0000000000000000 .debug_str
0000000000000000 l d .note.GNU-stack 0000000000000000 .note.GNU-stack
0000000000000000 l d .eh_frame 0000000000000000 .eh_frame
0000000000000000 l d .comment 0000000000000000 .comment
0000000000000000 g F .text.startup 0000000000000026 main
0000000000000000 *UND* 0000000000000000 _Z21TrustMeCompilerIExistv
0000000000000000 *UND* 0000000000000000 _ZSt4cout
0000000000000000 *UND* 0000000000000000 _ZNSolsEi
0000000000000000 *UND* 0000000000000000 _ZNSo5flushEv
0000000000000000 *UND* 0000000000000000 _ZNSt8ios_base4InitC1Ev
0000000000000000 *UND* 0000000000000000 .hidden __dso_handle
0000000000000000 *UND* 0000000000000000 _ZNSt8ios_base4InitD1Ev
0000000000000000 *UND* 0000000000000000 __cxa_atexit
See how it says *UND* (undefined)? That's because the compiler doesn't know where the function is, it just knows it exists (along with the standard library stuff, which the linker will find itself).
In closing
USE HEADER GUARDS, and with templates put #include "file.cpp" at the bottom, BEFORE the closing header guard. That way you can include header files as usual :)
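A minimal sketch of that layout, assuming a made-up header output.hpp with a made-up function to_text. Here the definition is written directly in the header; an #include of a separate implementation file at the same spot would have the same effect:

```cpp
// output.hpp (hypothetical header)
#ifndef OUTPUT_HPP
#define OUTPUT_HPP

#include <sstream>
#include <string>

// Declaration: enough for the compiler to type-check a call.
template <class T>
std::string to_text(const T& what);

// Definition: must be visible in every translation unit that instantiates the
// template, so it lives inside the header guard as well.
template <class T>
std::string to_text(const T& what) {
    std::ostringstream os;
    os << what;
    return os.str();
}

#endif  // OUTPUT_HPP
```

Any source file can now include the header and call to_text(5) without a separate object file having to provide to_text<int>.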
The answer to your question is present in every sample that comes with the XMP SDK Toolkit. Clients must compile XMP.incl_cpp to ensure that all client-side glue code is generated. Do this by including it in exactly one of your source files.
For your ready reference, I am pasting below a more detailed explanation from the section "Template classes and accessing the API" of XMPProgrammersGuide.pdf, which comes with the XMP SDK Toolkit:
Template classes and accessing the API
The full client API is defined and documented in the TXMP*.hpp header files. The TXMP* classes are C++ template classes that must be instantiated with a string class such as std::string, which is used to return text strings for property values, serialized XMP, and so on. To allow your code to access the entire XMP API you must:
Provide a string class such as std::string to instantiate the template classes.
Provide access to XMPCore and XMPFiles by including the necessary defines and headers. To do this, add the necessary define and includes directives to your source code so that all necessary code is incorporated into the build:
#include <string>
#define XMP_INCLUDE_XMPFILES 1 //if using XMPFiles
#define TXMP_STRING_TYPE std::string
#include "XMP.hpp"
The SDK provides complete reference documentation for the template classes, but the templates must be instantiated for use. You can read the header files (TXMPMeta.hpp and so on) for information, but do not include them directly in your code. There is one overall header file, XMP.hpp, which is the only one that C++ clients should include using the #include directive. Read the instructions in this file for instantiating the template classes. When you have done this, the API is available through the concrete classes named SXMP*; that is, SXMPMeta, SXMPUtils, SXMPIterator, and SXMPFiles. This document refers to the SXMP* classes, which you can instantiate and which provide static functions.
Clients must compile XMP.incl_cpp to ensure that all client-side glue code is generated. Do this by including it in exactly one of your source files.
Read XMP_Const.h for detailed information about types and constants for namespace URIs and option flags.
I have a c++ program which includes an external dependency on an empty xlsx file. To remove this dependency I converted this file to a binary object in view of linking it in directly, using:
ld -r -b binary -o template.o template.xlsx
followed by
objcopy --rename-section .data=.rodata,alloc,load,readonly,data,contents template.o template.o
Using objdump, I can see three variables declared :
$ objdump -x template.o
template.o: file format elf64-x86-64
template.o
architecture: i386:x86-64, flags 0x00000010:
HAS_SYMS
start address 0x0000000000000000
Sections:
Idx Name Size VMA LMA File off Algn
0 .rodata 00000fd1 0000000000000000 0000000000000000 00000040 2**0
CONTENTS, ALLOC, LOAD, READONLY, DATA
SYMBOL TABLE:
0000000000000000 l d .rodata 0000000000000000 .rodata
0000000000000fd1 g *ABS* 0000000000000000 _binary_template_xlsx_size
0000000000000000 g .rodata 0000000000000000 _binary_template_xlsx_start
0000000000000fd1 g .rodata 0000000000000000 _binary_template_xlsx_end
I then tell my program about this data :
template.h:
#ifndef TEMPLATE_H
#define TEMPLATE_H
#include <cstddef>
extern "C" {
extern const char _binary_template_xlsx_start[];
extern const char _binary_template_xlsx_end[];
extern const int _binary_template_xlsx_size;
}
#endif
This compiles and links fine (although I am having some trouble automating it with cmake, see here: compile and add object file from binary with cmake).
However, when I use _binary_template_xlsx_size in my code, it is interpreted as a pointer to an address that doesn't exist. So to get the size of my data, I have to pass (int)&_binary_template_xlsx_size (or (int)(_binary_template_xlsx_end - _binary_template_xlsx_start))
Some research tells me that the *ABS* in the objdump above means "absolute value" but I don't get why. How can I get my c++ (or c) program to see the variable as an int and not as a pointer?
An *ABS* symbol is an absolute address; it's more often created by passing --defsym foo=0x1234 to ld.
--defsym symbol=expression
Create a global symbol in the output file, containing the absolute
address given by expression. [...]
Because an absolute symbol is a constant, it's not possible to link it into a C source file as a variable; all C object variables have an address, but a constant doesn't.
To make sure you don't dereference the address (i.e. read the variable) by accident, it's best to define it as const char [] as you have with the other symbols:
extern const char _binary_template_xlsx_size[];
If you want to make sure you're using it as an int, you could use a macro (intptr_t comes from <stdint.h>):
extern const char _abs_binary_template_xlsx_size[] asm("_binary_template_xlsx_size");
#define _binary_template_xlsx_size ((int) (intptr_t) _abs_binary_template_xlsx_size)
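If you would rather not touch the *ABS* symbol at all, the pointer difference mentioned in the question also works. A small sketch (blob_size and blob_to_string are made-up helper names; in the real program the arguments would be _binary_template_xlsx_start and _binary_template_xlsx_end):

```cpp
#include <cstddef>
#include <string>

// Size of an embedded blob, computed from its start and one-past-the-end symbols.
inline std::size_t blob_size(const char* start, const char* end) {
    return static_cast<std::size_t>(end - start);
}

// Copies the blob into a std::string; embedded zero bytes are preserved
// because the length comes from the pointer range, not from strlen.
inline std::string blob_to_string(const char* start, const char* end) {
    return std::string(start, end);
}
```

Both symbols are ordinary addresses inside .rodata, so this variant relies only on the two array declarations already present in template.h.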