Illegal BitCast while compiling opencl kernel - llvm

The below bitcast instruction is throwing me an Illegal Bitcast error, can someone point what the problem is?
%opencl.image1d_ro_t = type opaque
%struct.dev_image_t = type { i8*, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32 }
%astype = bitcast %opencl.image1d_ro_t addrspace(1)* %image to %struct.dev_image_t*

You're casting from address space 1 to the default address space, 0. That won't work, as the documentation says. Each address space is independent.
Address spaces are meant for things like programs that have some garbage-collected and some manually managed memory. The pointer points to two profoundly different kinds of memory.

Related

How can I call LLVM IR arg with only one arg?

In LLVM IR, if I define printf as a single arg func, I'm able to use it. However, if I define it as vararg, it gives an error:
#msg = constant [13 x i8] c"hello world\0A\00"
declare i32 #printf(i8*) ; works
;declare i32 #printf(i8*, ...) ; error: '#printf' defined with type 'i32 (i8*, ...)*'
; call i32 #printf(i8* %msg)
define i32 #main () {
%msg = getelementptr [13 x i8]* #msg, i64 0, i64 0
call i32 #printf(i8* %msg)
ret i32 0
}
How do I tell LLVM IR that printf is vararg, but call it with only one argument?
Note this passage from the description of the call instruction in the LLVM Language Reference (emphasis mine):
'fnty': shall be the signature of the function being called. The argument types must match the types implied by this signature. This type can be omitted if the function is not varargs.
So if the function is variadic, you do need to provide the function type as part of the call instruction.

How are c++ class instance members handed at the machine level?

I understand the basic layout (given a typical c++ implementation) of class instance members, but say you have MyClass with int num as a member, and you create an instance of it, how is the specific address of the member in memory handled at run time?
I'll be clearer with an example:
class MyClass
{
int num;
int num2;
int num3;
public:
void setNum(); //always sets num to 10
};
Then you call setnum, how does it know what memory to set to 10?
The memory layout for MyClass might look like
class MyClass size(12):
+---
0 | num
4 | num1
8 | num2
+---
So is it as simple as when setNum gets called with the hidden pointer to your instance of myclass for member access it gets written based on offest? forexample myclasspointer+4?
EDIT clarification how does it decide where to write to? failed copypaste left the vftable in there. I totally imagine its gonna just be a known offset right?
Or is it something more complex?
Apologizing for unclear terminallogy I rarely know how to phrase a question right...
The compiler will know the contents of the class (or struct), and most importantly the offsets of the different member variables. the setNum function is given a this pointer as a "hidden" argument, and the compiler will take the this variable and add the offset for num.
Exactly how this happens depends on the compiler. In LLVM, it would use a getelementptr VM instruction, which understands the structure and, given a base-address, adds the offset given by the index. This will then translate to some sort of instruction that takes the this and either a direct offset addition in a single instruction, or two instructions to load the pointer and then add the offset - depending a little bit on the architecture, and what the next instructions "need".
Since the num member is the first member in the struct, it will be zero, so on x86-64, compiled with clang++ -O1, we get this disassembly:
_ZN7MyClass6setNumEv: # #_ZN7MyClass6setNumEv
movl $10, (%rdi)
retq
In other words, move the number 10 into the address of this (in %rdi - first argument on a Linux machine).
The LLVM IR shows better what goes on:
%class.MyClass = type { i32, i32, i32 }
; Function Attrs: nounwind uwtable
define void #_ZN7MyClass6setNumEv(%class.MyClass* nocapture %this) #0 align 2 {
entry:
%num = getelementptr inbounds %class.MyClass* %this, i64 0, i32 0
store i32 10, i32* %num, align 4, !tbaa !1
ret void
}
The class contains 3 i32 (32 bit integers), and the function takes a this pointer, it then uses getelementptr to get the first element (element 0). Yes, there's one more argument than you'd expect. That's how LLVM works ;)
Then a store instruction for the value 10 into the %num calculated address.
If we change the code in setNum so that it stores 10 into num2 instead, we get:
define void #_ZN7MyClass6setNumEv(%class.MyClass* nocapture %this) #0 align 2 {
entry:
%num2 = getelementptr inbounds %class.MyClass* %this, i64 0, i32 2
store i32 10, i32* %num2, align 4, !tbaa !1
ret void
}
Note the change of the last number into getelementptr.
As assembly code it becomes:
_ZN7MyClass6setNumEv: # #_ZN7MyClass6setNumEv
movl $10, 8(%rdi)
retq
(As it currently stands, in Revision 2 of the original question, your class MyClass has a size of 12, 3 * 4 bytes, not 8 like your text says).

__asm__ gcc call to a memory address

I have a code that allocates memory, copies some buffer to that allocated memory and then it jumps to that memory address.
the problem is that I cant jump to the memory address. Im using gcc and __asm__ but I cant call that memory address.
I want to do something like:
address=VirtualAlloc(NULL,len+1, MEM_COMMIT, PAGE_EXECUTE_READWRITE);
dest=strncpy(address, buf, len);
And then I want to do this in ASM:
MOV EAX, dest
CALL EAX.
I've tried something like:
__asm__("movl %eax, dest\n\t"
"call %eax\n\t");
But it does not work.
How can I do it?
There is usually no need to use asm for this, you can simply go through a function pointer and let the compiler take care of the details.
You do need to use __builtin___clear_cache(buf, buf+len) after copy machine code to a buffer before you dereference a function-pointer to it, otherwise it can be optimized away as a dead store.. x86 has coherent instruction caches so it doesn't compile to any extra instructions, but you still need it so the optimizer knows what's going on.
static inline
int func(char *dest, int len) {
__builtin___clear_cache(dest, dest+len); // no instructions on x86 but still needed
int ret = ((int (*)(void))dest)(); // cast to function pointer and deref
return ret;
}
compiles with GCC9.1 -O2 -m32 to
func(char*, int):
jmp [DWORD PTR [esp+4]] # tailcall
Also, you don't actually need to copy a string, you can just mprotect or VirtualProtect the page it's in to make it executable. But if you want to make sure it does stop at the first 0 byte to test your shellcode, then sure copy it.
If you nevertheless insist on inline asm, you should know that gcc inline asm is a complex thing. Also, if you expect the function to return, you should really make sure it follows the calling convention, in particular it preserves the registers it should.
AT&T syntax is op src, dst so your mov was actually a store to the global symbol dest.
That said, here is the answer to the question as worded:
int ret;
__asm__ __volatile__ ("call *%0" : "=a" (ret) : "0" (dest) : "ecx", "edx", "memory");
Explanation: https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html
call *%0 = the %0 refers to the first substitued argument, the * is standard gas syntax for indirect call
"=a" (ret) = output argument in eax register should be assigned to variable ret after the block
"0" (dest) = input argument in the same place as output argument 0 (which is eax) should be loaded from dest before the block
"ecx", "edx" = tell the compiler these registers may be altered by the asm block, as per normal calling convention.
"memory" = tell the compiler the asm block might make unspecified modifications to memory, so don't cache anything
Note that in x86-64 System V (Linux / OS X), it's not safe to make a function call from inline asm like this. There's no way to declare a clobber on the red zone below RSP.

x86 logical address syntax error

Compiler: gcc 4.7.1, 32bit, ubuntu
Here's an example:
int main(void)
{
unsigned int mem = 0;
__asm volatile
(
"mov ebx, esp\n\t"
"mov %0, [ds : ebx]\n\t"
: "=m"(mem)
);
printf("mem = 0x%08x\n", mem);
return 0;
}
gcc -masm=intel -o app main.c
Assembler messages: invalid use of register!
As I know, ds and ss point to the same segment. I don't know why I can't use [ds : ebx] logical address for addressing.
Your code has two problems:
One: the indirect memory reference should be:
mov %0, ds : [ebx]
That is, with the ds out of the brackets.
Two: A single instruction cannot have both origin and destination in memory, you have to use a register. The easiest way would be to indicate =g that basically means whatever, but in your case it is not possible because esp cannot be moved directly to memory. You have to use =r.
Three: (?) You are clobbering the ebx register, so you should declare it as such, or else do not use it that way. That will not prevent compilation, but will make your code to behave erratically.
In short:
unsigned int mem = 0;
__asm volatile
(
"mov ebx, esp\n\t"
"mov %0, ds : [ebx]\n\t"
: "=r"(mem) :: "ebx"
);
Or better not to force to use ebx, let instead the compiler decide:
unsigned int mem = 0, temp;
__asm volatile
(
"mov %1, esp\n\t"
"mov %0, ds : [%1]\n\t"
: "=r"(mem) : "r"(temp)
);
BTW, you don't need the volatile keyword in this code. That is used to avoid the assembler to be optimized away even if the output is not needed. If you write the code for the side-effect, add volatile, but if you write the code to get an output, do not add volatile. That way, if the optimizing compiler determines that the output is not needed, it will remove the whole block.

storing structs in ROM on ARM device

I have some constant data that I want to store in ROM since there is a fair amount of it and I'm working with a memory-constrained ARM7 embedded device. I'm trying to do this using structures that look something like this:
struct objdef
{
int x;
int y;
bool (*function_ptr)(int);
some_other_struct * const struct_array; // array of similar structures
const void* vp; // previously ommittted to shorten code
}
which I then create and initialize as globals:
const objdef def_instance = { 2, 3, function, array, NULL };
However, this eats up quite a bit of RAM despite the const at the beginning. More specifically, it significantly increases the amount of RW data and eventually causes the device to lock up if enough instances are created.
I'm using uVision and the ARM compiler, along with the RTX real-time kernel.
Does anybody know why this doesn't work or know a better way to store structured heterogenous data in ROM?
Update
Thank you all for your answers and my apologies for not getting back to you guys earlier. So here is the score so far and some additional observations on my part.
Sadly, __attribute__ has zero effect on RAM vs ROM and the same goes for static const. I haven't had time to try the assembly route yet.
My coworkers and I have discovered some more unusual behavior, though.
First, I must note that for the sake of simplicity I did not mention that my objdef structure contains a const void* field. The field is sometimes assigned a value from a string table defined as
char const * const string_table [ROWS][COLS] =
{
{ "row1_1", "row1_2", "row1_3" },
{ "row2_1", "row2_2", "row2_3" },
...
}
const objdef def_instance = { 2, 3, function, array, NULL };//->ROM
const objdef def_instance = { 2, 3, function, array, string_table[0][0] };//->RAM
string_table is in ROM as expected. And here's the kicker: instances of objdef get put in ROM until one of the values in string_table is assigned to that const void* field. After that the struct instance is moved to RAM.
But when string_table is changed to
char const string_table [ROWS][COLS][MAX_CHARS] =
{
{ "row1_1", "row1_2", "row1_3" },
{ "row2_1", "row2_2", "row2_3" },
...
}
const objdef def_instance = { 2, 3,function, array, NULL };//->ROM
const objdef def_instance = { 2, 3, function, array, string_table[0][0] };//->ROM
those instances of objdef are placed in ROM despite that const void* assigment. I have no idea why this should matter.
I'm beginning to suspect that Dan is right and that our configuration is messed up somewhere.
I assume you have a scatterfile that separates your RAM and ROM sections. What you want to do is to specify your structure with an attribute for what section it will be placed, or to place this in its own object file and then specify that in the section you want it to be in the scatterfile.
__attribute__((section("ROM"))) const objdef def_instance = { 2, 3, function, array };
The C "const" keyword doesn't really cause the compiler to put something in the text or const section. It only allows the compiler to warn you of attempts to modify it. It's perfectly valid to get a pointer to a const object, cast it to a non-const, and write to it, and the compiler needs to support that.
Your thinking is correct and reasonable. I've used Keil / uVision (this was v3, maybe 3 years ago?) and it always worked how you expected it to, i.e. it put const data in flash/ROM.
I'd suspect your linker configuration / script. I'll try to go back to my old work & see how I had it configured. I didn't have to add #pragma or __attribute__ directives, I just had it place .const & .text in flash/ROM. I set up the linker configuration / memory map quite a while ago, so unfortunately, my recall isn't very fresh.
(I'm a bit confused by people who are talking about casting & const pointers, etc... You didn't ask anything about that & you seem to understand how "const" works. You want to place the initialized data in flash/ROM to save RAM (not ROM->RAM copy at startup), not to mention a slight speedup at boot time, right? You're not asking if it's possible to change it or whatever...)
EDIT / UPDATE:
I just noticed the last field in your (const) struct is a some_other_struct * const (constant pointer to a some_other_struct). You might want to try making it a (constant) pointer to a constant some_other_struct [some_other_struct const * const] (assuming what it points to is indeed constant). In that case it might just work. I don't remember the specifics (see a theme here?), but this is starting to seem familiar. Even if your pointer target isn't a const item, and you can't eventually do this, try changing the struct definition & initializing it w/ a pointer to const and just see if that drops it into ROM. Even though you have it as a const pointer and it can't change once the structure is built, I seem to remember something where if the target isn't also const, the linker doesn't think it can be fully initialized at link time & defers the initialization to when the C runtime startup code is executed, incl. the ROM to RAM copy of initialized RW memory.
You could always try using assembly language.
Put in the information using DATA statements and publish (make public) the starting addresses of the data.
In my experience, large Read-Only data was declared in a source file as static const. A simple global function inside the source file would return the address of the data.
If you are doing stuff on ARM you are probably using the ELF binary format. ELF files contain an number of sections but constant data should find its way into .rodata or .text sections of the ELF binary. You should be able to check this with the GNU utility readelf or the RVCT utility fromelf.
Now assuming you symbols find themselves in the correct part of the elf file, you now need to find out how the RTX loader does its job. There is also no reason why the instances cannot share the same read only memory but this will depend on the loader. If the executable is stored in the rom, it may be run in-place but may still be loaded into RAM. This also depends on the loader.
A complete example would have been best. If I take something like this:
typedef struct
{
char a;
char b;
} some_other_struct;
struct objdef
{
int x;
int y;
const some_other_struct * struct_array;
};
typedef struct
{
int x;
int y;
const some_other_struct * struct_array;
} tobjdef;
const some_other_struct def_other = {4,5};
const struct objdef def_instance = { 2, 3, &def_other};
const tobjdef tdef_instance = { 2, 3, &def_other};
unsigned int read_write=7;
And compile it with the latest codesourcery lite
arm-none-linux-gnueabi-gcc -S struct.c
I get
.arch armv5te
.fpu softvfp
.eabi_attribute 20, 1
.eabi_attribute 21, 1
.eabi_attribute 23, 3
.eabi_attribute 24, 1
.eabi_attribute 25, 1
.eabi_attribute 26, 2
.eabi_attribute 30, 6
.eabi_attribute 18, 4
.file "struct.c"
.global def_other
.section .rodata
.align 2
.type def_other, %object
.size def_other, 2
def_other:
.byte 4
.byte 5
.global def_instance
.align 2
.type def_instance, %object
.size def_instance, 12
def_instance:
.word 2
.word 3
.word def_other
.global tdef_instance
.align 2
.type tdef_instance, %object
.size tdef_instance, 12
tdef_instance:
.word 2
.word 3
.word def_other
.global read_write
.data
.align 2
.type read_write, %object
.size read_write, 4
read_write:
.word 7
.ident "GCC: (Sourcery G++ Lite 2010.09-50) 4.5.1"
.section .note.GNU-stack,"",%progbits
With the section marked as .rodata, which I would assume is desired. Then it is up to the linker script to make sure that ro data is put in rom. And note the read_write variable is after switching from .rodata to .data which is read/write.
So to make this a complete binary and see if it gets placed in rom or ram (.text or .data) then
start.s
.globl _start
_start:
b reset
b hang
b hang
b hang
b hang
b hang
b hang
b hang
reset:
hang: b hang
Then
# arm-none-linux-gnueabi-gcc -c -o struct.o struct.c
# arm-none-linux-gnueabi-as -o start.o start.s
# arm-none-linux-gnueabi-ld -Ttext=0 -Tdata=0x1000 start.o struct.o -o struct.elf
# arm-none-linux-gnueabi-objdump -D struct.elf > struct.list
And we get
Disassembly of section .text:
00000000 <_start>:
0: ea000006 b 20 <reset>
4: ea000008 b 2c <hang>
8: ea000007 b 2c <hang>
c: ea000006 b 2c <hang>
10: ea000005 b 2c <hang>
14: ea000004 b 2c <hang>
18: ea000003 b 2c <hang>
1c: ea000002 b 2c <hang>
00000020 <reset>:
20: e59f0008 ldr r0, [pc, #8] ; 30 <hang+0x4>
24: e5901000 ldr r1, [r0]
28: e5801000 str r1, [r0]
0000002c <hang>:
2c: eafffffe b 2c <hang>
30: 00001000 andeq r1, r0, r0
Disassembly of section .data:
00001000 <read_write>:
1000: 00000007 andeq r0, r0, r7
Disassembly of section .rodata:
00000034 <def_other>:
34: 00000504 andeq r0, r0, r4, lsl #10
00000038 <def_instance>:
38: 00000002 andeq r0, r0, r2
3c: 00000003 andeq r0, r0, r3
40: 00000034 andeq r0, r0, r4, lsr r0
00000044 <tdef_instance>:
44: 00000002 andeq r0, r0, r2
48: 00000003 andeq r0, r0, r3
4c: 00000034 andeq r0, r0, r4, lsr r0
And that achieved the desired result. The read_write variable is in ram, the structs are in the rom. Need to make sure both the const declarations are in the right places, the compiler gives no warnings about say putting a const on some pointer to another structure that it may not determine at compile time as being a const, and even with all of that getting the linker script (if you use one) to work as desired can take some effort. for example this one seems to work:
MEMORY
{
bob(RX) : ORIGIN = 0x0000000, LENGTH = 0x8000
ted(WAIL) : ORIGIN = 0x2000000, LENGTH = 0x8000
}
SECTIONS
{
.text : { *(.text*) } > bob
.data : { *(.data*) } > ted
}