I am working with LLVM 3.4 and want to recover source line numbers from the IR. The IR is generated from simple C code with Clang, and I want to map a line in the IR body back to its line number in the C source file.
I tried this:
For an Instruction BI:
unsigned Line = BI->getDebugLoc().getLine();
For a Loop L:
std::cout << L->getStartLoc().getLine();
But the result stored or printed is always 0. I don't know how to obtain the source line number from LLVM IR.
My source C file is:
#include <stdio.h>
int main()
{
int i;
int inbuf[100];
int outbuf[100];
for(i = 0; i < 100; ++i)
inbuf[i] ^= outbuf[i];
inbuf[1] += 402;
inbuf[6] += 107;
inbuf[97] += 231;
for(i = 0; i < 100; ++i)
{
inbuf[i] += outbuf[i];
}
inbuf[47] += 312;
//print-statements
for (i=0;i<100;i++) {
printf("inbuf[%d] = %d\n",i,inbuf[i]);
}
return 0;
}
Command used:
~/llvm/build/Release+Asserts/bin/clang -O3 -fno-unroll-loops -fno-vectorize -fno-slp-vectorize -S -emit-llvm sample.c -o sample.ll
Thanks!
To get line number information into the .ll file, you must specify both the -O0 and -g flags for clang.
http://llvm.org/docs/SourceLevelDebugging.html#debugging-optimized-code
Line numbers are stored in specialized metadata nodes.
http://llvm.org/docs/LangRef.html#specialized-metadata-nodes
So the full command line must look like this:
~/llvm/build/Release+Asserts/bin/clang -O0 -g -S -emit-llvm sample.c -o sample.ll
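Once the .ll file contains debug metadata, you can sanity-check it without going through the LLVM API. The following is only a minimal sketch, and it assumes the textual form that newer LLVM releases print for location nodes, i.e. `!DILocation(line: ..., column: ..., scope: ...)`; LLVM 3.4 prints its metadata differently, so treat the format and the helper name `diLocationLine` as assumptions for illustration.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the value of the "line:" field of a textual
// !DILocation node, or -1 if the text does not contain one.
// Assumes the newer textual IR format, not the one printed by LLVM 3.4.
int diLocationLine(const std::string &metadataText) {
    const std::string nodeTag = "!DILocation(";
    const std::string fieldTag = "line:";

    std::size_t node = metadataText.find(nodeTag);
    if (node == std::string::npos) return -1;

    std::size_t field = metadataText.find(fieldTag, node + nodeTag.size());
    if (field == std::string::npos) return -1;

    // Skip spaces between "line:" and the number.
    std::size_t pos = field + fieldTag.size();
    while (pos < metadataText.size() && metadataText[pos] == ' ') ++pos;
    if (pos >= metadataText.size() ||
        metadataText[pos] < '0' || metadataText[pos] > '9') return -1;

    // Accumulate the decimal digits of the line number.
    int line = 0;
    while (pos < metadataText.size() &&
           metadataText[pos] >= '0' && metadataText[pos] <= '9') {
        line = line * 10 + (metadataText[pos] - '0');
        ++pos;
    }
    return line;
}
```

If no such nodes show up in the file at all, the IR was most likely generated without -g, which matches the symptom of getLine() always returning 0.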
In the following example, the elimination of unused code is performed for sin() but not for pow(). I was wondering why. Tried gcc and clang.
Here are some more details about this example, which is otherwise mostly code.
The code contains a loop over an integer from which a floating point number is computed.
The number is passed to a mathematical function: either pow() or sin()
depending on which macros are defined.
If macro USE is defined, the sum of all returned values is accumulated in another variable which is then copied to a volatile variable to prevent the optimizer from removing the code entirely.
// main.cpp
#include <chrono>
#include <cmath>
#include <cstdio>
int main() {
std::chrono::steady_clock clock;
auto start = clock.now();
double s = 0;
const size_t count = 1 << 27;
for (size_t i = 0; i < count; ++i) {
const double x = double(i) / count;
double a = 0;
#ifdef POW
a = std::pow(x, 0.5);
#endif
#ifdef SIN
a = std::sin(x);
#endif
#ifdef USE
s += a;
#endif
}
auto stop = clock.now();
printf(
"%.0f ms\n", std::chrono::duration<double>(stop - start).count() * 1e3);
volatile double a = s;
(void)a;
}
As seen from the output, the computation of sin() is completely eliminated if the results are unused. This is not the case for pow() since the execution time does not decrease.
I normally observe this if the call may return a NaN (log(-x) but not log(+x)).
# g++ 10.2.0
g++ -std=c++14 -O3 -DPOW main.cpp -o main && ./main
3064 ms
g++ -std=c++14 -O3 -DPOW -DUSE main.cpp -o main && ./main
3172 ms
g++ -std=c++14 -O3 -DSIN main.cpp -o main && ./main
0 ms
g++ -std=c++14 -O3 -DSIN -DUSE main.cpp -o main && ./main
1391 ms
# clang++ 11.0.1
clang++ -std=c++14 -O3 -DPOW main.cpp -o main && ./main
3288 ms
clang++ -std=c++14 -O3 -DPOW -DUSE main.cpp -o main && ./main
3351 ms
clang++ -std=c++14 -O3 -DSIN main.cpp -o main && ./main
177 ms
clang++ -std=c++14 -O3 -DSIN -DUSE main.cpp -o main && ./main
1524 ms
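The volatile-sink trick used in main.cpp can be isolated into a small function. This is only a sketch of that one technique: the name `sumSquareRoots` is made up for illustration, and it uses std::sqrt rather than the pow()/sin() calls from the benchmark. It does not explain why one call is eliminated and the other is not; it only shows how the final store keeps the accumulated result observable.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical helper: sums sqrt(i) for i in [0, count) and routes the
// result through a volatile variable so the compiler has to materialize it.
double sumSquareRoots(std::size_t count) {
    double s = 0;
    for (std::size_t i = 0; i < count; ++i) {
        s += std::sqrt(static_cast<double>(i));
    }
    // A volatile store is an observable side effect, so the value of s
    // must actually be computed before this point.
    volatile double sink = s;
    return sink;
}
```

Note that in the benchmark the sink only protects the sum `s`. When USE is not defined, the value `a` never reaches `s`, so the compiler is free to drop the call if it can prove the call has no other effects.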
Q: Is it possible to improve the I/O performance of this code with LLVM Clang under OS X?
test_io.cpp:
#include <iostream>
#include <string>
constexpr int SIZE = 1000*1000;
int main(int argc, const char * argv[]) {
std::ios_base::sync_with_stdio(false);
std::cin.tie(nullptr);
std::string command(argv[1]);
if (command == "gen") {
for (int i = 0; i < SIZE; ++i) {
std::cout << 1000*1000*1000 << " ";
}
} else if (command == "read") {
int x;
for (int i = 0; i < SIZE; ++i) {
std::cin >> x;
}
}
}
Compile:
clang++ -x c++ -lstdc++ -std=c++11 -O2 test_io.cpp -o test_io
Benchmark:
> time ./test_io gen | ./test_io read
real 0m2.961s
user 0m3.675s
sys 0m0.012s
Apart from the sad fact that reading a 10 MB file costs 3 seconds, this is much slower than g++ (installed via Homebrew):
> gcc-6 -x c++ -lstdc++ -std=c++11 -O2 test_io.cpp -o test_io
> time ./test_io gen | ./test_io read
real 0m0.149s
user 0m0.167s
sys 0m0.040s
My clang version is Apple LLVM version 7.0.0 (clang-700.0.72). The clang builds installed from Homebrew (3.7 and 3.8) also produce slow I/O. Clang 3.8 installed on Ubuntu generates fast I/O. Apple LLVM version 8.0.0 also generates slow I/O (two people I asked confirmed this).
I also ran dtruss on it (sudo dtruss -c "./test_io gen | ./test_io read") and found that the clang version makes 2686 write_nocancel syscalls, while the gcc version makes 2079 writev syscalls, which probably points to the root of the problem.
The issue is that libc++ does not implement sync_with_stdio.
Your command line clang++ -x c++ -lstdc++ -std=c++11 -O2 test_io.cpp -o test_io does not use libstdc++; it uses libc++. To force the use of libstdc++, you need -stdlib=libstdc++.
Minimal example if you have the input file ready:
#include <iostream>
constexpr int SIZE = 1000*1000;
int main(int argc, const char * argv[]) {
std::ios_base::sync_with_stdio(false);
int x;
for (int i = 0; i < SIZE; ++i) {
std::cin >> x;
}
}
Timings:
$ clang++ test_io.cpp -o test -O2 -std=c++11
$ time ./test read < input
real 0m2.802s
user 0m2.780s
sys 0m0.015s
$ clang++ test_io.cpp -o test -O2 -std=c++11 -stdlib=libstdc++
clang: warning: libstdc++ is deprecated; move to libc++
$ time ./test read < input
real 0m0.185s
user 0m0.169s
sys 0m0.012s
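If switching the standard library is not an option, one way to sidestep iostream entirely is to read the input in bulk (for example with std::fread on stdin) and parse the digits yourself. The sketch below shows only the parsing step. The helper name `parseUnsignedInts` is hypothetical, it handles only non-negative decimal integers separated by whitespace, and whether it is actually faster on your setup is an assumption you would need to measure.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: extracts whitespace-separated non-negative decimal
// integers from a buffer that was read in bulk. Any non-digit character is
// treated as a separator; there is no sign or overflow handling.
std::vector<long long> parseUnsignedInts(const std::string &text) {
    std::vector<long long> values;
    long long current = 0;
    bool inNumber = false;
    for (char c : text) {
        if (c >= '0' && c <= '9') {
            current = current * 10 + (c - '0');
            inNumber = true;
        } else if (inNumber) {
            // A separator ends the number currently being built.
            values.push_back(current);
            current = 0;
            inNumber = false;
        }
    }
    if (inNumber) values.push_back(current);  // input ended without a separator
    return values;
}
```

For the benchmark above, the buffer would hold the output of ./test_io gen, and the returned vector would contain SIZE copies of 1000000000.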
I have a program that does independent computations on a bunch of images. This seems like a good use case for OpenMP:
//file: WoodhamData.cpp
#include <omp.h>
...
void WoodhamData::GenerateLightingDirection() {
int imageWidth = (this->normalMap)->width();
int imageHeight = (this->normalMap)->height();
#pragma omp paralell for num_threads(2)
for (int r = 0; r < RadianceMaps.size(); r++) {
if (omp_get_thread_num() == 0){
std::cout<<"threads="<<omp_get_num_threads()<<std::endl;
}
...
}
}
In order to use OpenMP, I add -fopenmp to my makefile, so it outputs:
g++ -g -o test.exe src/test.cpp src/WoodhamData.cpp -pthread -L/usr/X11R6/lib -fopenmp --std=c++0x -lm -lX11 -Ilib/eigen/ -Ilib/CImg
However, I am sad to say, my program reports threads=1 (run from terminal ./test.exe ...)
Does anyone know what might be wrong? This is the slowest part of my program, and it would be great to speed it up a bit.
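Your OpenMP directive is misspelled: it should be "parallel", not "paralell". Because the misspelled directive is not recognized, no parallel region is created, the loop runs serially, and omp_get_num_threads() reports 1.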
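For reference, here is a minimal sketch of a loop with the correctly spelled directive. It is not the code from WoodhamData.cpp: the function `parallelSum` and its reduction are made up for illustration. The `_OPENMP` guard only keeps the header optional when the file is built without -fopenmp.

```cpp
#include <vector>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical example: sums a vector with two OpenMP threads.
// When built without -fopenmp the pragma is ignored and the loop runs serially.
long long parallelSum(const std::vector<int> &values) {
    long long total = 0;
    const int n = static_cast<int>(values.size());
    #pragma omp parallel for num_threads(2) reduction(+:total)
    for (int i = 0; i < n; ++i) {
        total += values[i];
    }
    return total;
}
```

The loop bound is converted to int up front so that the loop variable and the bound have the same signed type, which avoids the signed/unsigned comparison in the original `r < RadianceMaps.size()`.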
Here's the function I'm looking at:
template <uint8_t Size>
inline uint64_t parseUnsigned( const char (&buf)[Size] )
{
uint64_t val = 0;
for (uint8_t i = 0; i < Size; ++i)
if (buf[i] != ' ')
val = (val * 10) + (buf[i] - '0');
return val;
}
I have a test harness which passes in all possible numbers with Size=5, left-padded with spaces. I'm using GCC 4.7.2. When I run the program under callgrind after compiling with -O3 I get:
I refs: 7,154,919
When I compile with -O2 I get:
I refs: 9,001,570
OK, so -O3 improves the performance (and I confirmed that some of the improvement comes from the above function, not just the test harness). But I don't want to completely switch from -O2 to -O3, I want to find out which specific option(s) to add. So I consult man g++ to get the list of options it says are added by -O3:
-fgcse-after-reload [enabled]
-finline-functions [enabled]
-fipa-cp-clone [enabled]
-fpredictive-commoning [enabled]
-ftree-loop-distribute-patterns [enabled]
-ftree-vectorize [enabled]
-funswitch-loops [enabled]
So I compile again with -O2 followed by all of the above options. But this gives me even worse performance than plain -O2:
I refs: 9,546,017
I discovered that adding -ftree-vectorize to -O2 is responsible for this performance degradation. But I can't figure out how to match the -O3 performance with any combination of options. How can I do this?
In case you want to try it yourself, here's the test harness (put the above parseUnsigned() definition under the #includes):
#include <cmath>
#include <stdint.h>
#include <cstdio>
#include <cstdlib>
#include <cstring>
template <uint8_t Size>
inline void increment( char (&buf)[Size] )
{
for (uint8_t i = Size - 1; i < 255; --i)
{
if (buf[i] == ' ')
{
buf[i] = '1';
break;
}
++buf[i];
if (buf[i] > '9')
buf[i] -= 10;
else
break;
}
}
int main()
{
char str[5];
memset(str, ' ', sizeof(str));
unsigned max = std::pow(10, sizeof(str));
for (unsigned ii = 0; ii < max; ++ii)
{
uint64_t result = parseUnsigned(str);
if (result != ii)
{
printf("parseUnsigned(%.*s) from %u: %llu\n", (int)sizeof(str), str, ii, (unsigned long long)result); // str is not null-terminated, so bound it with a precision
abort();
}
increment(str);
}
}
A very similar question was already answered here: https://stackoverflow.com/a/6454659/483486
I've copied the relevant text underneath.
UPDATE: There are questions about this in the GCC Wiki:
"Is -O1 (-O2,-O3 or -Os) equivalent to individual -foptimization options?"
No. First, individual optimization options (-f*) do not enable optimization, an option -Os or -Ox with x > 0 is required. Second, the -Ox flags enable many optimizations that are not controlled by any individual -f* option. There are no plans to add individual options for controlling all these optimizations.
"What specific flags are enabled by -O1 (-O2, -O3 or -Os)?"
Varies by platform and GCC version. You can get GCC to tell you what flags it enables by doing this:
touch empty.c
gcc -O1 -S -fverbose-asm empty.c
cat empty.s
I am trying to use Octave functions in C++. I installed Octave 3.8.0 on Mac OS X 10.9.3 and followed the standalone program example on the Octave website:
#include <iostream>
#include <octave/oct.h>
int
main (void)
{
std::cout << "Hello Octave world!\n";
int n = 2;
Matrix a_matrix = Matrix (n, n);
for (octave_idx_type i = 0; i < n; i++)
for (octave_idx_type j = 0; j < n; j++)
a_matrix(i,j) = (i + 1) * 10 + (j + 1);
std::cout << a_matrix;
return 0;
}
Then I type
$ mkoctfile --link-stand-alone main.cpp -o standalone
But it shows mkoctfile: command not found. What is the problem?
I also tried to compile the C++ file with g++:
$ g++ -I /usr/local/octave/3.8.0/include/octave-3.8.0 main.cpp
but it reports the following two errors.
1) 'config.h' file not found with include; use "quotes" instead.
2) fatal error: 'hdft.h' file not found.
Please help me!
Judging by the shell response, Octave may not be registered properly on your system (for example, its bin directory may not be on your PATH).
Try invoking the command from inside the Octave interpreter.
I'm not sure about macOS, but on Linux, mkoctfile is not bundled with the default Octave distribution. Instead, it requires a supplementary package, liboctave-dev, which has to be installed in addition to Octave itself.
This is not documented in the Octave web tutorial.