What does a dangerous relocation error mean? - c++

I am getting a linking error:
dangerous relocation: l32r: Literal placed after use:
I am still trying to debug; however, I want to better understand this error. I understand what relocation is; however, I am not sure how it can be dangerous and was looking for some clarification. Also, a small code snippet that could generate this type of error would be helpful.
In short, what is "a dangerous relocation"?

This is a two-part answer, as there are really two questions here, one general ("what's a dangerous relocation?") and one specific to the Xtensa ("why can't you have a literal placed after where it's used in the code?").
What's all this dangerous relocation stuff about, anyway?
To understand what a 'dangerous relocation' is, we must first understand what a relocation is. As a compiler is generating an object file from some piece of code, it will need to reference symbols that are defined somewhere else: perhaps in another object file in the link, or perhaps in a shared library. However, the compiler does not know the addresses of external symbols when compiling a given object file. It must emit a relocation to serve as a named placeholder, telling the linker "OK, shove the address of foobar into this spot, and oh, you have to do X, Y, and Z to it to make it fit into the instructions there."
Most of the time, this works without a hitch, you get a binary out of your linker, and Bob's your uncle. When this process breaks down, and the linker cannot make the address of the symbol the compiler gave it fit into the instructions at the site of the relocation, it gives up and tosses out a 'dangerous relocation' message (among others -- the all-too-common 'relocation truncated to fit' pops out of this process as well) to inform the programmer that something has gone terribly wrong.
What's wrong with a literal placed after where it's used?
Now that we know what a generic 'dangerous relocation' is, we can move on to the second half of the error message, namely "l32r: Literal placed after use". The Xtensa uses an instruction known as L32R to load constant values from memory that don't fit into the Xtensa's MOVI immediate load instruction, which has a 12-bit signed immediate field. The L32R instruction is described in the Xtensa ISA reference as follows:
L32R is a PC-relative 32-bit load from memory. It is typically used to load constant
values into a register when the constant cannot be encoded in a MOVI instruction.
L32R forms a virtual address by adding the 16-bit one-extended constant value encoded
in the instruction word shifted left by two to the address of the L32R
plus three with the two least significant bits cleared. Therefore, the offset can always
specify 32-bit aligned addresses from -262141 to -4 bytes from the address of the L32R
instruction. 32 bits (four bytes) are read from the physical address. This data is then
written to address register at.
Given the restrictions on L32R quoted above, the error message breaks down quite nicely: the compiler generated a L32R to load a constant (which could be a value or an address) somewhere in your code, but either the constant's value was not available to the compiler (think extern const), or the address needed to be filled in by the linker (this is the likely case). So, it emitted this L32R relocation to tell the linker to 'fill in the blank' in the L32R instruction with the address of a constant value or constant address somewhere in your program. However, the linker couldn't find anywhere in the previous 256KB of code -- or literal pool, depending on how your compiler and Xtensa core are configured -- to shove a constant, so it gave up and spat out the error message you asked about.
How does one fix this?
Unfortunately, a 'dangerous relocation' of this sort depends on code size, so unless you have a bona fide compiler or linker bug on your hands, reproducing it with a small snippet of code will be impossible. There are two possible causes you can try to address, though.
There's no room for my literal pool!
If you are compiling with -mno-text-section-literals (which is the default), the linker gets fed the literal pools as separate sections which it then has to interleave with the code sections. If you have a particularly large object file in your link, it may have over 256KB of code in its .text section, leaving nowhere in the range of a L32R instruction for the linker to place the associated literal pool section at. Compiling with -mtext-section-literals should eliminate the error; if it does not work, you have that flag on already, or if you are using -ffunction-sections (which places each function into its own section; it is sometimes used in embedded work to allow the linker to throw out unused code), read on.
The linker (or assembler) still can't find a place to put my literals!
When the compiler and assembler are told to emit literals into the text section, they restrict placement of the literal pools to before the functions that use them (i.e. before the ENTRY instruction of the function) in order to minimize the risk that the literal pools will be executed as code, with obviously bad results. If you have an extremely long function in your code -- I shudder to think what sort of function could generate more than 256KB of code -- the 'default' literal pool placed before the ENTRY instruction can wind up out of range of L32R instructions near the end of the function. Normally, the compiler will emit an assembler directive known as .literal_position, as well as a jump around the mid-function literal pool, to provide the assembler and linker with an extra place to shove literals into. You can tell the compiler to output an assembler listing using -save-temps and then search it for .literal_position directives; if one isn't present in a function that has L32R instructions past the 256KB mark, congratulations! You just found a compiler bug!
What else could happen to produce this?
The only other circumstance I see that can provoke such a problem is if there is nowhere before the ENTRY instruction that the compiler or linker can put a literal pool, and the compiler can't figure this out on its own -- this can occur with interrupt handlers, or functions that are explicitly placed at the beginning of a physical memory boundary by the linker script. In this case, you will need to insert the .literal_position directive and its associated jump & label by hand in an asm statement at the top of the culprit function in order to provide the assembler with a place to put the culprit function's literals. As the GAS manual puts it:
The assembler will automatically place text section literal pools before ENTRY
instructions, so the .literal_position directive is only needed to specify some other
location for a literal pool. You may need to add an explicit jump instruction to skip
over an inline literal pool.
For example, an interrupt vector does not begin with an ENTRY instruction so the
assembler will be unable to automatically find a good place to put a literal pool.
Moreover, the code for the interrupt vector must be at a specific starting address, so
the literal pool cannot come before the start of the code. The literal pool for the
vector must be explicitly positioned in the middle of the vector (before any uses of the
literals, due to the negative offsets used by PC-relative L32R instructions).
Wait, I'm using the absolute literal option!
If you have the LITBASE option enabled in your Xtensa core and are getting this error, this is a sign that your literal pool has overflowed. The compiler should generate the 'glue' needed to switch literal pools in this case, though: if it doesn't, congratulations! You have just found a compiler bug!

Here's http://www.mail-archive.com/mspgcc-users#lists.sourceforge.net/msg11488.html
This might be helpful for you.
Good luck :)

Related

Access voilation reading location

I am trying to debugg the project on MSVS 2010.
Implementation - c++; when i am degubbing the source code, i get the following failure reported by MSVS.
Failure reported:
"First chance exception at 0x00000013fb5b9ee in unit.exe: 0xc00000005 access voilation reading location 0x00000000000000c."
the problem lies in obtaining address.
int base = (*(abc::g_runc1.m_paulsenderpin.m_lastchunk_p)).xcpp::cxcppoutput::m_baseaddress;
my project is very big to include the source code,
In short it can be described as:
- paul is a module with sender pin connected to c1.
- xcpp is the interface
this source code and the project is correct and works without failure on ARM compiler, but on MSVS it gives access violation error.
On msdn there are some posts about permission set by assembly, and which avoids to read the addressed location. if so, how to change it... ?
or is there any better option to find the problem...?
Any help is appreciated.
Your code is trying to access location that actually isn't owned by it's process. No data of user applications can be located at addresses so close to zero. As your expressions is too long to simply find where is the member containing zero reference, my tip is m_last chunk_p, and the m_baseaddress seems to be member at offset 12.
There is one simple explanation why does your code work fine when it's compiled by something that works with ARM: ARM uses aligned memory access, so class and structure members are aligned to full blocks, although they don't always use whole space allocated for them. Therefore you use bigger pointer or wrong memset parameters somewhere in your code and your pointer gets overwritten.
Problem may also disappear when you compile it with another version of (possibly another) compiler (or non machine with different processor architecture 32/64), as the size of fundamental types isn't always the same.
You should try to check what pointed is actually zero (or possibly 12) in your expression and try to set a watch on it. Be sure you use sizeof properly everywhere.
The Problem lies with the memory addressing, in ARM debugger 32 bits and MSVS10 48 bits of addressing, because of it the MSB byte is lost and so cannot find the correct memory address...!!!

gdb : findind every jumps to an address

I'm trying to understand a small binary using gdb but there is something I can't find a way to achieve : how can I find the list of jumps that point to a specified address?
I have a small set of instructions in the disassembled code and I want to know where it is called.
I first thought about searching the corresponding instruction in .text, but since there are many kind of jumps, and address can be relative, this can't work.
Is there a way to do that?
Alternatively, if I put a breakpoint on this address, is there a way to know the address of the previous instruction (in this case, the jump)?
If this is some subroutine being called from other places, then it must respect some ABI while it's called.
Depending on a CPU used, the return address (and therefore a place from where it was called) will be stored somewhere (on stack or in some registers). If you replace original code with the one that examines this, you can create a list of return addresses. Or simpler, as you suggested, if you use gdb and put a breakpoint at that routine, you can see from where it was called by using a bt command.
If it was actual jump (versus a "jump to subroutine") that led you there (which I doubt, if it's called from many places, unless it's a kind of longjmp/setjmp), then you will probably not be able to determine where this was called from, unless the CPU you are using allows you to trace the execution in some way.

how to use macro for a unsigned long number?

Here're my codes:
#define MSK 0x0F
#define UNT 1
#define N 3000000000
unsigned char aln[1+N];
unsigned char pileup[1+N];
void set(unsigned long i)
{
if ((aln[i] & MSK) != MSK ) {
aln[i] += UNT;
}
}
int main(void) {}
When I try to compile it, the compiler complains like this:
tmp/ccJ4IgSa.o: In function `set':
bitmacs.c:(.text+0xf): relocation truncated to fit: R_X86_64_32S against symbol `aln' defined in COMMON \
section in /tmp/ccJ4IgSa.o
bitmacs.c:(.text+0x29): relocation truncated to fit: R_X86_64_32S against symbol `aln' defined in COMMON\
section in /tmp/ccJ4IgSa.o
bitmacs.c:(.text+0x32): relocation truncated to fit: R_X86_64_32S against symbol `aln' defined in COMMON\
section in /tmp/ccJ4IgSa.o
I think the reason may be the N is too big, because it can compile successfully if I change N to 2000000000. But I need 3000000000 as the value of N..
Anyone has idea about that?
Per your original question: use the integer literal suffix UL (or similar) to force the storage type of N:
#define N 3000000000UL
However, (per your comment on HLundvall's answer) the relocation truncated to fit error obviously isn't due to this - it may (as Mystical and Matt Lacey say) simply be too big to fit in the segment.
As an aside, if you ask a seperate question explaining what you're trying to accomplish with your huge arrays, someone may be able to suggest a better solution (that is more likely to fit in memory)
For example:
your sample code is only using the low nibble of each byte in the code shown: you could pack this into half the size (which is admittedly still much too large)
depending on your access patterns, you might be able to keep the array on disk and cache a working subset in memory
there may be better overall algorithms and data structures if we knew what you needed
Disregarding the "formal" problem that your numeric literal isn't of the correct type (see the other answers for the correct syntax), the key point here is that it's a very bad idea to allocate a 3 GB static/global array.
static and global1 variables on most platforms are mapped directly from the executable image, which means that your executable would have to be as big as 3 GB, which is quite big even for current day standards. Even if on some platforms this limitation may be lifted (see the comments), you don't have any control on how to handle the failure of allocation.
Most importantly, global variables are not intended for such big stuff, and you are likely to find problems with arbitrary limits imposed by the linker (such as the one you found) and the loader. Instead, you should allocate anything that's bigger than a few KBs on the heap, using malloc, new or some platform-specific function, handling gracefully the possible failure at runtime.
Still, keep in mind that for an application running under almost any 32 bit operating system it's not possible to get 3 GB of contiguous memory as you request, and it's impossible altogether to get more than one of these arrays (=more than 4 GB of contiguous memory) without resorting to platform-specific tricks (e.g. mapping only specific parts of the arrays in memory at a given moment).
Also, are you sure that you do need all that contiguous memory since your program starts to run? Isn't there some better data structure/algorithm that could avoid allocating all that memory?
In general, what the standard calls variables with static storage duration.
To enter a numeric constant of type unsigned long use:
#define N 3000000000UL
The problem is that gcc (by default) uses pc-relative accesses to get the address of static data objects on x86_64 targets, and those accesses are limited to 2^31 bytes maximum. So if the symbol ends up getting placed more than 2GB away from the code that accesses it, you'll end up getting this link error when it tries to use an offset that is too big to fit in the 32 bits of space allowed in the instruction.
You can avoid this problem by using the -mcmodel=large option to gcc. This tells it to not assume that it can use 32-bit PC relative offsets to access symbols (among other things)
Note that the type suffix of the constant literal is mostly irrelevant -- a constant literal that is too big for an int will automatically become a long (or even long long if needed) without any suffix. See 6.4.4.1.5 of the C99 spec.
Your executable is trying to put objects in memory past the 4GB mark, which is not allowed. See this link: http://www.technovelty.org/code/c/relocation-truncated.html.
From the article: "If you're seeing this and you're not hand-coding, you probably want to check out the -mmodel argument to gcc."

Memory error: Dereference null pointer/ SSE misalignment

I'm compiling a program on remote linux server. The program compiled. However when I run it the program ends abruptly. So I debugged the program using DDT. It spits out the following error:
Process 0:
Memory error detected in ClassName::function (filename.cpp:6462).
Thread 1 attempted to dereference a null pointer or execute an SSE instruction with an
incorrectly aligned memory address (the latter may sometimes occur spuriously if guard
pages are enabled)
Tip: Use the stack list and the local variables to explore your program's current
state and identify the source of the error.
Can anyone please tell me what exactly this error means?
The line where the program stops looks like this:
SumUtility = ParaEst[0] + hhincome * ParaEst[71] + IsBlack * ParaEst[61] + IsBachAss * (ParaEst[55]);
This is within a switch case.
These are the variable types
vector<double> ParaEst;
double hhincome;
int IsBlack, Is BachAss;
Thanks for the help!
It means that:
ParaEst is NULL or a bad Pointer
ParaEst's individual array values are not aligned to 16-byte boundaries, required for SSE.
hhincome, IsBlack, or IsBachAss are not aligned to 16-byte boundaries and are SSE type values.
SumUtility is not aligned to 16-bytes and is a SSE type field.
If you could post the assembly code of the exact line that failed along with the register values of that assembler line, we could tell you exactly which of the above conditions have failed. It would also help to see the types of each variable shown to help narrow root the cause.
Ok... The problem finally got fixed.
The issue was that the expression where the code was breaking down was in a newly defined function. However for some weird reason running the make-file did not incorporate these changes and was still compiling using the previously compiled .o file. This resulted in garbage values being assigned to the variables within this new function. To top things off the program calls this function as a first step. Hence there was this systematic breakdown. The technical aspect of this was what Michael alluded to.
After this I would always recommend to use a make clean option in the make file. The issue of why running the make file is failing to compile the modified source file is an issue that definitely warrants further discussion.
Thanks for the responses!!

What does "BUS_ADRALN - Invalid address alignment" error means?

We are on HPUX and my code is in C++.
We are getting
BUS_ADRALN - Invalid address alignment
in our executable on a function call. What does this error means?
Same function is working many times then suddenly its giving core dump.
in GDB when I try to print the object values it says not in context.
Any clue where to check?
You are having a data alignment problem. This is likely caused by trying to read or write through a bad pointer of some kind.
A data alignment problem is when the address a pointer is pointing at isn't 'aligned' properly. For example, some architectures (the old Cray 2 for example) require that any attempt to read anything other than a single character from memory only occur through a pointer in which the last 3 bits of the pointer's value are 0. If any of the last 3 bits are 1, the hardware will generate an alignment fault which will result in the kind of problem you're seeing.
Most architectures are not nearly so strict, and frequently the required alignment depends on the exact type being accessed. For example, a 32 bit integer might require only the last 2 bits of the pointer to be 0, but a 64 bit float might require the last 3 bits to be 0.
Alignment problems are usually caused by the same kinds of problems that would cause a SEGFAULT or segmentation fault. Usually a pointer that isn't initialized. But it could be caused by a bad memory allocator that isn't returning pointers with the proper alignment, or by the result of pointer arithmetic on the pointer when it isn't of the correct type.
The system implementation of malloc and/or operator new are almost certainly correct or your program would be crashing way before it currently does. So I think the bad memory allocator is the least likely tree to go barking up. I would check first for an uninitialized pointer and then bad pointer arithmetic.
As a side note, the x86 and x86_64 architectures don't have any alignment requirements. But, because of how cache lines work, and for various other reasons, it's often a good idea for performance to align your data on a boundary that's as big as the datatype being stored (i.e. a 4 byte boundary for a 32 bit int).
Most processors (not x86 and friends.. the blacksheep of the family lol) require accesses to certain elements to be aligned on multiples of bytes. I.e. if you read an integer from address 0x04 that is okay, but if you try to do the same from 0x03 you will cause an interrupt to be thrown.
This is because it's easier to implement the load/store hardware if it's always on a multiple of the data size with which you're working.
Since HP-UX runs only on RISC processors, which typically have such constraints, you should see here -> http://en.wikipedia.org/wiki/Data_structure_alignment#RISC.
Actually HP-UX has its own great forum on ITRC and some HP staff members are very helpful. I just took a look at the same topic you are asking and here are some results. For example the similar problem was caused actually by a bad input parameter. I strongly advise you first to read answers to similar question and if necessary to post your question there.
By the way it is likely that you will be asked to post results of these gdb commands:
(gdb) bt
(gdb) info reg
(gdb) disas $pc-16*8 $pc+16*4
Most of these issues are caused by multiple upstream dependencies linking to different versions of the same library.
For example, both the gnustl and stlport provide distinct implementations of the C++ standard library. If you compile and link against gnustl, while one of your dependencies was compiled and linked against stlport, then you will each have a different implementation of standard functions and classes. When your program is launched, the dynamic linker will attempt to resolve all of the exported symbols, and will discover known symbols at incorrect offsets, resulting in the BUS_ADRALN signal.