The man page for opt says: "It takes LLVM source files as input, runs the specified optimizations or analyses on it, and then outputs the optimized file or the analysis results".
My goal: to use the built-in optimisation pass -dce available in opt, which performs Dead Code Elimination.
My source file foo.c:
int foo(void)
{
    int a = 24;
    int b = 25; /* Assignment to dead variable -- dead code */
    int c;
    c = a * 4;
    return c;
}
Here is what I did:
1. clang-7.0 -S -emit-llvm foo.c -o foo.ll
2. opt -dce -S foo.ll -o fooOpt.ll
What I expect: a .ll file in which the dead code (the line marked with the comment in the source) is eliminated.
What I get: fooOpt.ll is the same as the non-optimised foo.ll.
I have already seen this SO answer, but I still didn't get optimised code.
Am I missing something here? Can someone please guide me onto the right path?
Thank you.
If you look at the .ll file generated by clang, it will contain a line like this:
attributes #0 = { noinline nounwind optnone sspstrong uwtable ...}
You should remove the optnone attribute here. Whenever a function has the optnone attribute, opt won't touch that function at all.
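If you'd rather not edit the .ll file by hand, you can instead stop clang from emitting optnone in the first place; a sketch, assuming your clang supports the cc1 flag -disable-O0-optnone (clang 5 and newer, including 7.x, do):

clang -S -emit-llvm -Xclang -disable-O0-optnone foo.c -o foo.ll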
Now if you try again, you'll notice ... nothing. It still does not work.
This time the problem is that the code works on memory, not on registers. What we need to do is convert the allocas to registers using -mem2reg. In fact, doing this alone already optimizes away b, so you don't even need the -dce flag.
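Putting it together, a minimal sketch of the remaining step:

opt -mem2reg -S foo.ll -o fooOpt.ll

With the foo.c above, the optimized function should come out roughly like this (exact value names will vary):

define i32 @foo() #0 {
entry:
  %mul = mul nsw i32 24, 4
  ret i32 %mul
}

b is gone entirely, and a and c survive only as the operand and result of the multiply; adding -instcombine (or just running opt -O1) would fold this further to ret i32 96.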
I want to debug the code of a plain assembler project for the ATmega2560, using the Microchip debugger. The goal is to get source-level debugging with all variables and functions, breakpoints, etc.
I managed to create a C "stub" file with a main() function that calls the assembler code.
extern int foo(int a);

int main(void)
{
    int a = 0;
    while (1)
    {
        a = foo(a);
    }
}
But the assembler code also contains the interrupt vector table, including the reset vector.
.extern main

.section .vectors
.global RESET_
RESET_: jmp WARM_0

.section code
.global foo
foo:
    ret
WARM_0:
    call main
    ret
.end
Now I want to run the code from the label RESET_. The linker has placed the code in the section .vectors. That's okay so far, but the vector table from the GCC startup files is placed in that section before the vector table in my code. The GCC startup code must be removed to get my vector to address 0. Therefore I activate the linker option "Do not use standard start files (-nostartfiles)". That gives the desired result: the reset vector has a jump to RESET_.
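For reference, a command-line equivalent of that build configuration would look roughly like this (the file names are assumptions; the important parts are -g for debug info and -nostartfiles at link time):

avr-gcc -mmcu=atmega2560 -g -c main.c -o main.o
avr-gcc -mmcu=atmega2560 -g -c vectors.S -o vectors.o
avr-gcc -mmcu=atmega2560 -g -nostartfiles -o firmware.elf main.o vectors.o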
But this has an important side effect: the debugger is no longer able to debug at source-code level. The C file with the main() function is still linked, but source-level debug support is lost.
How can I debug a plain assembler project with the Microchip debugger/simulator for AVR8?
Remark: the code in the assembler file is not sufficient for a valid program. It has been reduced to a minimal example that should work in the Microchip environment.
I'm writing a programming language compiler to integrate DSLs and C/C++, and I have decided on LLVM for a couple of reasons.
There is a main program. In this main program I load bitcode files which were compiled by clang. The loadable bitcode file represents a short but complete programming language environment with a REPL, parser, linker and AST.
My understanding so far was that boolean datatypes are represented in IR as i1. I have optimized my code with -O3, and for a boolean I get the following IR code (obtained by disassembling the generated bitcode file with llvm-dis):
%"class.tl::contrib::toy::ToyREPL" = type <{ %"class.tl::contrib::toy::InitLanguage"*, i8, [7 x i8] }>
The class is ToyREPL and it uses another class, InitLanguage. Oddly, the boolean seems to be represented by an i8 plus an array of i8. I don't really get it.
I have defined a Makefile. First I compile the files; afterwards I link them into a .bc file, then optimize it and link it with some other libs.
#cd $(BIN)/$(TARGET)/$(2); $(LINK) -o $(1).$(BITCODE_EXT) $(3)
#cd $(BIN)/$(TARGET)/$(2); $(OPT) -O3 $(1).$(BITCODE_EXT) -o $(1).$(OPT_NAME).$(BITCODE_EXT) $(OPTIMIZER_FLAGS)
#$(LINK) -o $(BIN)/$(TARGET)/$(2)/$(1).$(BITCODE_EXT) $(BIN)/$(TARGET)/$(2)/$(1).$(OPT_NAME).bc $(LINK_OPTION) $(4)
Compiler flags are:
-v -g -emit-llvm -I$(BOOST_INC_DIR) -std=c++11 -D__STDC_CONSTANT_MACROS -D__STDC_LIMIT_MACROS
Optimizer flags are -std-link-opts
Link flag is -v.
The relevant part of the class ToyREPL is here:
class ToyREPL {
private:
    InitLanguage *initLang;
    bool runs = false;
Now my question: is my assumption wrong that bool should be compiled to i1 in bitcode? What kind of compiler switch do I need to get an i1? Let me know if you think my build process is wrong in some way. The generated bitcode file is readable and I can retrieve the module and the class ToyREPL as a StructType.
If I understand you correctly, your question is essentially - why was the C++ class
class ToyREPL {
    bool runs = false;
    ...
};
compiled by Clang into the type <{ i8, [7 x i8], ... }>?
So first of all, why Clang chose i8 over i1 for a boolean field is straightforward - the smallest C++ type takes one byte of memory, and unless you use bit-fields, that also applies to fields in structs. Also see this related question about why a whole byte is used for booleans. LLVM itself uses i1 for boolean values, but that's because IR is roughly platform-independent - in the lowering phase those might become whole bytes again.
As for [7 x i8], that's padding, inserted to ensure every object of this type is 64-bit aligned and does not share its memory with any other object - a very reasonable approach on a 64-bit system. Alternatively, if there is a following struct field, the padding might have been inserted to ensure that field is 64-bit aligned.
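To make the layout concrete, here is a minimal C++ sketch of the same situation (Example is an illustrative stand-in, and the byte counts in the comments assume a typical 64-bit ABI with 8-byte pointers):

class InitLanguage; // a forward declaration is enough for a pointer member

class Example {
    InitLanguage *initLang; // 8 bytes -> the InitLanguage* field in the IR type
    bool runs = false;      // 1 byte  -> the i8
    // 7 bytes of tail padding -> the [7 x i8]
};

// Holds on typical 32- and 64-bit ABIs alike: the bool is rounded up
// to the struct's pointer alignment.
static_assert(sizeof(Example) == 2 * sizeof(void *),
              "pointer + bool, rounded up to pointer alignment");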
The Wikipedia article on alignment and padding is a useful starting point if you want to know more.
I want to get the line number of an instruction (and also of a variable declaration - alloca and global). The instruction is saved in an array of instructions. I have the function:
Constant* metadata::getLineNumber(Instruction* I) {
    if (MDNode *N = I->getMetadata("dbg")) { // this if is never entered
        DILocation Loc(N);
        unsigned Line = Loc.getLineNumber();
        return ConstantInt::get(Type::getInt32Ty(I->getContext()), Line);
    }
    return NULL; // without this, control falls off the end when there is no debug metadata
}
and in my main() I have:
errs() << "\nLine number is " << *metadata::getLineNumber(allocas[p]);
The result is NULL, since I->getMetadata("dbg") is false.
Is there a way to enable the dbg metadata in LLVM without rebuilding the LLVM framework, e.g. by using a flag when compiling the target program or when running my pass (I used -debug)?
Compiling the program with "-O3 -g" should give full debug information, but I still get the same result. I am aware of http://llvm.org/docs/SourceLevelDebugging.html, from which I can see that it is quite easy to take the source line number from a metadata field.
PS: For allocas, it seems that I have to use the findDbgDeclare method from DbgInfoPrinter.cpp.
Thank you in advance!
LLVM provides debugging information if you specify the -g flag to Clang. You don't need to rebuild LLVM to enable/disable it - any LLVM will do (including a pre-built one from binaries or binary packages).
The problem may be that you're trying to have debug information in highly optimized code (-O3). This is not necessarily possible, since LLVM simply optimizes some code away in such cases and there's not much meaning to debug information. LLVM tries to preserve debug info during optimizations, but it's not an easy task.
Start by generating unoptimized code with debug info (-O0 -g) and write your code/passes to work with that. Then graduate to optimized code, and try to examine what specifically gets lost. If you think that LLVM is being stupid, don't hesitate to open a bug.
Some random tips:
Generate IR from clang (-emit-llvm) and look at the debug metadata nodes in it. Then you can run it through opt with optimizations and see what remains (see the sketch after these tips).
The -debug option to llc and other LLVM tools is quite unrelated to debug info in the source.
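For the first tip, a minimal sketch (test.c is a placeholder name):

clang -O0 -g -emit-llvm -S test.c -o test.ll
grep '!dbg' test.ll       # every instruction should carry a !dbg attachment
opt -O3 -S test.ll -o test-opt.ll
grep '!dbg' test-opt.ll   # see which attachments survived the optimizer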
I am optimizing some hotspots in my application and compilation is done using gcc-arm.
Now, is there any chance that the following statements result in different assembler code:
static const pixel_t roundedwhite = 4294572537U;
return (packed >= roundedwhite) ? purewhite : packed;
// OR
const pixel_t roundedwhite = 4294572537U;
return (packed >= roundedwhite) ? purewhite : packed;
// OR
return (packed >= 4294572537U) ? purewhite : packed;
Is there any chance that my ARM compiler might produce unwanted code for the first case, or should this get optimized anyway?
I assume it's pretty much the same but, unfortunately, I'm not sure what gcc-arm does compared to ordinary gcc, and I can't access the disassembly listing.
Thank you very much.
Call gcc with the -S flag and take a look at the assembly:
-S
Stop after the stage of compilation proper; do not assemble. The output is in the form of an assembler code file for each non-assembler input file specified.
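For example, assuming an arm-none-eabi cross toolchain (substitute whatever prefix and file name your gcc-arm setup uses):

arm-none-eabi-gcc -O2 -S pixel.c -o pixel.s

Compile each of the three variants this way and diff the resulting .s files.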
I would try it out myself and include the results in the answer, but I don't have an ARM compiler handy.
One difference is surely that the first version, with static, will use up some memory, even if the value gets inlined into the expression. This would make sense if you want to compute a more complex expression once and then store the result, but for this simple constant the static is unnecessary. That said, the compiler will very likely inline the value, as this is a very simple optimization and there is no reason for it not to.
I'll state off the bat that I'm not a programmer, and am probably in over my head.
I'm trying to track down a bug in the __strlen_sse2 (assembly) function installed in Debian as part of libc6-i686.
I already have a copy of the assembly code (.S file) and I need to figure out a way to call it from a C/C++ program. How can I achieve this?
edit:
Tried this code, but I get an error from gcc about undefined reference to '__strlen_sse2'
edit 2: It's my understanding that this is the proper answer to the question, but I lack the proper knowledge to carry it to completion. Thanks for the help everyone.
#include <stdio.h>
#include <string.h>

size_t __strlen_sse2(const char *);

int main(void)
{
    char buffer[255];
    printf("Standby.. ");
    fflush(stdout); /* make sure the prompt appears before we block on input */
    fgets(buffer, sizeof buffer, stdin); /* gets() is unsafe; fgets() bounds the read */
    __strlen_sse2("CRASH!");
    printf("OK!\n");
    return 0;
}
Like I said... not a programmer.
I hope my question makes sense. Please let me know if you need any more information.
You can't directly call the copy of __strlen_sse2 inside the usual, dynamically-linked /lib/libc.so.6 because it is a "hidden symbol" -- accessible to code inside libc itself, but not available for external linking.
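You can check this yourself with nm; a sketch, assuming your libc is at the usual path (adjust for your system, e.g. /lib/i686/cmov/libc.so.6 for libc6-i686):

nm -D /lib/libc.so.6 | grep strlen

nm -D lists only the dynamically exported symbols; strlen itself shows up, but __strlen_sse2 should not.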
You say you have the .S file that defines __strlen_sse2, from glibc's source code, but you're going to need to modify it to be buildable outside glibc. I found what I think is the right file, and was able to modify it pretty easily. Delete everything up to but not including the line that reads just
.text
and replace it with this:
#define PUSH(REG) pushl REG
#define POP(REG) popl REG
#define PARMS 4
#define STR PARMS
#define ENTRANCE
#define RETURN ret
#define L(x) .L##x
#define ENTRY(x) .globl x; .type x,#function; x:
#define END(x) .size x, .-x
Also delete the #endif line at the very end of the file. Then compile like so:
gcc -m32 -c strlen-sse2.S
gcc -m32 -c test.c
gcc -m32 test.o strlen-sse2.o
./a.out
You may not need the -m32 flags.
You might be able to get some help with your larger problem from superuser.com -- provide the contents of /proc/cpuinfo in your question.
Just declare the function, preferably with a full prototype, and call it. This is probably the right prototype:
size_t __strlen_sse2(const char *);
I'm a bit skeptical of your claim that you're trying to track down a bug in this function. It's more likely there's a bug in your program that's calling strlen with an invalid argument (either an invalid pointer or a pointer to something other than a null-terminated string).