I am trying to extract constants from LLVM IR for further analysis. So far I have been able to extract ints, floats, and doubles by using the appropriate methods.
How can I differentiate between floats and doubles before trying to get the value from the methods of the APFloat class? Without an appropriate check I end up triggering an assert when I invoke convertToFloat() on a double or convertToDouble() on a float. Is there some indirect mechanism in LLVM to distinguish between the datatypes before trying to get the value?
There are several ways; the simplest one I can think of is using the getSemantics method (it returns a const fltSemantics&, so compare addresses):
bool IsFloat = &MyFloat.getSemantics() == &APFloat::IEEEsingle;
bool IsDouble = &MyFloat.getSemantics() == &APFloat::IEEEdouble;
(In newer LLVM versions IEEEsingle and IEEEdouble are accessor functions, so the comparisons become &APFloat::IEEEsingle() and &APFloat::IEEEdouble().)
By the way, it's more common to just check the type of the Value that the APFloat came from, if you have it:
bool IsFloat = MyValue.getType()->isFloatTy();
bool IsDouble = MyValue.getType()->isDoubleTy();
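A minimal sketch combining the type check with the conversion calls (assuming V is a llvm::Value* you are inspecting; the names are illustrative):

if (auto *CFP = llvm::dyn_cast<llvm::ConstantFP>(V)) {
    const llvm::APFloat &APF = CFP->getValueAPF();
    if (CFP->getType()->isFloatTy()) {
        float F = APF.convertToFloat();    // safe: semantics are IEEEsingle
        // ... analyze F ...
    } else if (CFP->getType()->isDoubleTy()) {
        double D = APF.convertToDouble();  // safe: semantics are IEEEdouble
        // ... analyze D ...
    }
}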
If you also need the size of each type, you can compute a substitute for the C language's sizeof() as described in this link:
http://nondot.org/sabre/LLVMNotes/SizeOf-OffsetOf-VariableSizedStructs.txt
I have been using the ExprTk library quite frequently in the past in order to further process, in C++, large output files generated with Mathematica (containing mathematical expressions).
Until now, I exclusively used this library to process expressions that yield values of type double, for which the library works flawlessly after defining the types
typedef exprtk::symbol_table<double> symbol_table_t;
typedef exprtk::expression<double> expression_t;
typedef exprtk::parser<double> parser_t;
and storing "everything" in a struct
struct readInExpression
{
    double a, b;
    symbol_table_t symbol_table;
    expression_t expression;
};
Reading in a text file that contains the variables a and b as well as e.g. the user-defined function
double my_function(double a, double b) {
    return a + b;
}
can be achieved by means of
void readInFromFile(readInExpression* f, parser_t* p) {
    std::string file = "xyz.txt";
    std::ifstream ifs(file);
    std::string content( (std::istreambuf_iterator<char>(ifs) ),
                         (std::istreambuf_iterator<char>() ) );
    f->symbol_table.add_variable("a", f->a);
    f->symbol_table.add_variable("b", f->b);
    f->symbol_table.add_function("my_function", my_function);
    f->expression.register_symbol_table(f->symbol_table);
    p->compile(content, f->expression);
}
One may then evaluate the read-in expression for arbitrary values of a and b by using
double evaluateFile(readInExpression* f, double a, double b) {
    f->a = a;
    f->b = b;
    return f->expression.value();
}
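For completeness, a minimal usage sketch of the two helpers above (assuming the xyz.txt shown further below sits next to the executable):

readInExpression f;
parser_t p;
readInFromFile(&f, &p);
std::cout << evaluateFile(&f, 1.0, 2.0) << std::endl;  // a + b*my_function(a,b) = 1 + 2*3 = 7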
Recently, I ran into problems when trying to process text files that contain complex numbers and functions that return complex values of the type std::complex<double>. More specifically, I have a .txt file that contains expressions of the form
2*m*A0(m*m) + Complex(1.0,2.0)*B0(M*M,0.0,m*m)
where A0(a) and B0(a,b,c) are the scalar loop integrals that arise from the Passarino-Veltman reduction of (tensor) loop integrals in high-energy physics.
These can be evaluated numerically in C using LoopTools; note that they take complex values for certain values of a, b, and c. Simply replacing <double> by std::complex<double> in the typedefs above throws tons of errors when compiling. I am not sure whether the ExprTk library can handle complex numbers at all. I know that it cannot deal with custom classes, but from what I understand it should be able to handle native datatypes. (As I found here, ExprTk can at least deal with vectors, but given the complexity of the expressions I need to process, I do not think it is possible to somehow rewrite everything in terms of vectors, in particular because the algebra of complex numbers differs from that of vectors.) Note that I also cannot split the expressions into real and imaginary parts, because I have to evaluate the expressions for many different values of the variables.
I have dealt with complex numbers and the mentioned functions A0(a) and B0(a,b,c) in text files before, but I solved that by simply pulling the .txt files into the source with #include "xyz.txt" inside a corresponding function. That approach seems impossible given the size of the text files at hand (the compiler throws an error if I try it).
Does anybody know if and how ExprTk can deal with complex numbers? (A MWE would be highly appreciated.) If not, can anyone here suggest a different math parser that is user-friendly, can deal with complex numbers of the type std::complex<double>, and at the same time allows defining custom functions that themselves return such complex values?
A MWE:
/************/
/* Includes */
/************/
#include <iostream> // Input and Output on the console
#include <fstream> // Input and Output to files
#include <string> // In order to work with strings -- needed to loop through the Mathematica output files
#include "exprtk.hpp" // Parser to evaluate string expressions as mathematical/arithmetic input
#include <math.h> // Simple Math Stuff
#include <gsl/gsl_math.h> // GSL math stuff
#include <complex> // Complex Numbers
/**********/
/* Parser */
/**********/
// Type definitions for the parser
typedef exprtk::symbol_table<double> symbol_table_t; // (%)
typedef exprtk::expression<double> expression_t; // (%)
typedef exprtk::parser<double> parser_t; // (%)
/* This struct is used to store certain information of the Mathematica files in
order to later evaluate them for different variables with the parser library. */
struct readInExpression
{
    double a, b; // (%)
    symbol_table_t symbol_table;
    // Instantiate expression
    expression_t expression;
};
/* Global variable where the read-in file/parser is stored. */
readInExpression file;
parser_t parser;
/*******************/
/* Custom function */
/*******************/
double my_function(double a, double b) {
    return a + b;
}
/***********************************/
/* Converting Mathematica Notation */
/***********************************/
/* Mathematica prints complex numbers as Complex(x,y), so we need a function to convert to C++ standard. */
std::complex<double> Complex(double a, double b) { // (%)
    std::complex<double> c(a, b);
    return c;
}
/************************************/
/* Processing the Mathematica Files */
/************************************/
double evaluateFileDoubleValuedInclude(double a, double b) {
    return
#include "xyz.txt"
    ;
}
std::complex<double> evaluateFileComplexValuedInclude(double a, double b) {
    return
#include "xyzC.txt"
    ;
}
void readInFromFile(readInExpression* f, parser_t* p) {
    std::string file = "xyz.txt"; // (%)
    std::ifstream ifs(file);
    std::string content( (std::istreambuf_iterator<char>(ifs) ),
                         (std::istreambuf_iterator<char>() ) );
    // Register variables with the symbol_table
    f->symbol_table.add_variable("a", f->a);
    f->symbol_table.add_variable("b", f->b);
    // Add custom functions to the evaluation list (see definition above)
    f->symbol_table.add_function("my_function", my_function); // (%)
    // f->symbol_table.add_function("Complex", Complex); // (%)
    // Register the symbol_table with the instantiated expression
    f->expression.register_symbol_table(f->symbol_table);
    // Compile the expression with the instantiated parser
    p->compile(content, f->expression);
}
std::complex<double> evaluateFile(readInExpression* f, double a, double b) { // (%)
    // Set the values of the struct to the input values
    f->a = a;
    f->b = b;
    // Evaluate the result for the above values
    return f->expression.value();
}
int main() {
    exprtk::symbol_table<std::complex<double> > st1; // Works
    exprtk::expression<std::complex<double> > e1; // Works
    // exprtk::parser<std::complex<double> > p1; // Throws an error
    double a = 2.0;
    double b = 3.0;
    std::cout << "Evaluating the text file containing only double-valued functions via the #include method: \n" << evaluateFileDoubleValuedInclude(a,b) << "\n \n";
    std::cout << "Evaluating the text file containing complex-valued functions via the #include method: \n" << evaluateFileComplexValuedInclude(a,b) << "\n \n";
    readInFromFile(&file, &parser);
    std::cout << "Evaluating either the double-valued or the complex-valued file [see the necessary changes tagged with (%)]:\n" << evaluateFile(&file,a,b) << "\n";
    return 0;
}
xyz.txt
a + b * my_function(a,b)
xyzC.txt
2.0*Complex(a,b) + 3.0*a
To get the MWE to work, put the exprtk.hpp file in the same folder where you compile.
Note that the return type of the evaluateFile(...) function can be std::complex<double> even though only double-valued results are returned. Lines tagged with // (%) are the ones that need to change when trying out the complex-valued file xyzC.txt.
Instantiating exprtk::parser<std::complex<double> > throws (among others)
./exprtk.hpp:1587:10: error: no matching function for call to 'abs_impl'
exprtk_define_unary_function(abs )
while all the other needed templates seem not to complain about std::complex<double>.
I actually know next to nothing about ExprTk (just what I read in its documentation and a bit of its code -- EDIT: now somewhat more of its code), but it seems to me unlikely that you'll be able to accomplish what you want without doing some major surgery to the package.
The basis of its API, as you demonstrate in your question, are template objects specialised on a single datatype. The documentation says that that type "…can be any floating point type. This includes… any custom type conforming to an interface comptaible (sic) with the standard floating point type." Unfortunately, it doesn't clarify what they consider the interface of the standard floating point type to be; if it includes every standard library function which could take a floating point argument, it's a very big interface indeed. However, the distribution includes the adaptor used to create a compatible interface for the MPFR package which gives some kind of idea what is necessary.
But the issue here is that I suspect you don't want an evaluator which can only handle complex numbers. It seems to me that you want to be able to work with both real and complex numbers. For example, there are expressions which are unambiguous for real numbers and somewhat arbitrary for complex numbers; these include a < b and max(a, b), neither of which is implemented for complex types in C++. However, they are quite commonly used in real-valued expressions, so just eliminating them from the evaluation language would seem a bit arbitrary. [Note 1] ExprTk does assume that the numeric type is ordered for some purposes, including its incorrect (imho) delta equality operator, and those comparisons are responsible for a lot of the error messages which you are receiving.
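To make that concrete, a small sketch of what C++ itself does and does not define for std::complex:

#include <algorithm>
#include <complex>

std::complex<double> u(1, 2), v(3, 4);
bool eq = (u == v);          // fine: operator== is defined for std::complex
// bool lt = (u < v);        // does not compile: no operator< for std::complex
// auto m = std::max(u, v);  // does not compile: std::max requires operator<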
The ExprTk header does correctly figure out that std::complex<double> is a complex type, and tags it as such. However, no implementation is provided for any standard function called on complex numbers, even though C++ includes implementations for many of them. This absence is basically what triggers the rest of the errors, including the one you mention for abs, which is an example of a math function which C++ does implement for complex numbers. [Note 2]
That doesn't mean that there is no solution; just that the solution is going to involve a certain amount of work. So a first step might be to fill in the implementations for complex types, as per the MPFR adaptor mentioned above (although not everything goes through the _impl methods, as noted above with respect to ordered comparison operators).
One option would be to write your own "real or complex" datatype; in a simple implementation, you could use a discriminated union like:
template<typename T>
class RealOrComplex {
public:
    RealOrComplex(T a = 0)
        : is_complex_(false) { value_.real = a; }
    RealOrComplex(std::complex<T> a)
        : is_complex_(true) { value_.cmplx = a; }
    // Operator implementations, omitted
private:
    bool is_complex_;
    union Value {
        Value() : real(0) {}  // needed: std::complex has a non-trivial default constructor
        T real;
        std::complex<T> cmplx;
    } value_;
};
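If C++17 is available, a std::variant would avoid hand-rolling the discriminated union, though the arithmetic operators ExprTk needs would still have to be written by hand (a sketch; the alias name is illustrative):

#include <complex>
#include <variant>

using RealOrComplex17 = std::variant<double, std::complex<double>>;
// operator+, operator*, etc. would dispatch on the active alternative via std::visit.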
A possibly simpler but more problematic approach would be to let a real number simply be a complex number whose imaginary part is 0 and then write shims for any missing standard library math functions. But you might well need to shim a large part of the LoopTools library as well.
So while all that is doable, actually doing it is, I'm afraid, too much work for an SO answer.
Since there is some indication in the ExprTK source that the author is aware of the existence of complex numbers, you might want to contact them directly to ask about the possibility of a future implementation.
Notes
It seems that Mathematica throws an error if these operations are attempted on complex arguments. On the other hand, MATLAB makes an arbitrary choice: ordered comparison looks only at the real part, but maximum is (inconsistently) handled by converting to polar coordinates and then comparing component-wise.
Forcing standard interfaces to be explicitly configured seems to me to be a curious implementation choice. Surely it would have been better to allow standard functions to be the default implementation, which would also avoid unnecessary shims like the explicit reimplementation of log1p and other standard math functions. (Perhaps these correspond to known inadequacies in certain math libraries, though.)
... can anyone here suggest a different math parser that is user friendly and can deal with complex numbers of the type std::complex<double>, at the same time allowing to define custom functions that themselves return such complex values?
I have come across only one math parser that handles complex numbers, Foreval:
https://sourceforge.net/projects/foreval/
It is implemented as a DLL library.
You can connect real and complex variables of the "double" and "extended" types to Foreval in any form by passing the addresses of the variables.
It has built-in standard functions for complex variables. You can also connect external functions with complex variables.
However, the complex type passed to and returned from functions is Foreval's own internal type, which differs from std::complex<double>.
There are examples for GCC in the source.
There are two disadvantages:
There is only a 32-bit version of Foreval.dll (it can be linked only into 32-bit programs).
And it is available only for Windows.
But there are also advantages:
Foreval.dll is a compiler that generates machine code (fast calculations).
There is a floating-point real type "extended" ("double" too), also for complex numbers.
This may sound like a really dumb question, but it has been bothering me for the past few days, and it does not only concern C++, though I've added its tag. My question is this: in computer science, the Boolean (bool) datatype has only two possible values, true or false. And also, in computer science, 1 is true and 0 is false. So why does boolean exist at all? Why not use an integer that can hold only two possible values, such as 1 or 0?
For example:
bool mindExplosion = true; // true!
int mindExplosion = 1; // true!!
// or we can '#define true 1' and it's the same right?
What am I missing?
Why does bool exist when we can use int?
Well, you don't need something as large as an int to represent two states, so it makes sense to allow a smaller type to save space.
Why not use an integer that can hold only two possible values, such as 1 or 0?
That is exactly what bool is. It is an unsigned integer type that represents true (1) or false (0).
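That also shows up in object sizes (a sketch; the exact values are implementation-defined):

#include <iostream>

int main() {
    std::cout << sizeof(bool) << '\n';  // commonly 1
    std::cout << sizeof(int) << '\n';   // commonly 4
}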
Another nice thing about having a specific type for this is that it expresses intent without any need for documentation. If we had a function like (warning, very contrived example)
void output(T const & val, bool log)
It is easy to see that log is an option and if we pass false it won't log. If it were instead
void output(T const & val, int log)
Then we aren't sure what it does. Is it asking for a log level? A flag on whether to log or not? Something else?
What am I missing?
Expressiveness.
When a variable is declared int, it might be used only for 0 and 1, or it might hold anything from INT_MIN to INT_MAX.
When a variable is declared bool, it is made explicit that it is to hold a true / false value.
Among other things, this allows the compiler to emit warnings when an int is used in places where you really want a bool, or when you attempt to store a 2 in a bool. The compiler is your friend; give it all the hints possible so it can tell you when your code starts looking funky.
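A sketch of the kind of code such hints let the compiler flag (exact warnings vary by compiler):

bool flag = 2;      // many compilers warn about the implicit conversion; flag becomes true
if (flag == 2) {}   // suspicious: flag promotes to 0 or 1, so this is never true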
I'm trying to encode a tagged union (also known as a sum type) in LLVM and it doesn't seem possible while keeping the compiler frontend platform agnostic. Imagine I had this tagged union (expressed in Rust):
enum T {
C1(i32, i64),
C2(i64)
}
To encode this in LLVM I need to know the size of the largest variant. That in turn requires that I know the alignment and size of all fields. In other words, my frontend would need to
track the size and alignment of all things,
create a dummy struct (properly padded) to represent the biggest type that can fit any variant (e.g. {[2 x i64]}, assuming the tag can fit in the same word as the i32 field),
and finally either use packed structs or tell LLVM which "data layout" I assumed, so my computations match LLVM's.
What is the best way currently to encode tagged unions in LLVM?
Conceptually, I don't think there's a better way than what you described, except that I wouldn't bother with using the constructed type at the declaration site, since actually accessing the union would be easiest to do through a bitcast anyway.
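For illustration, one possible IR shape under the question's layout assumptions (a sketch using old-style typed pointers; the names and the exact payload layouts are assumptions, not what any particular frontend emits):

%T = type { [2 x i64] }   ; blob big enough for the tag plus the largest variant

define i32 @get_tag(%T* %t) {
entry:
  ; view the blob as variant C1: { tag, i32 field, i64 field }
  %as_c1 = bitcast %T* %t to { i32, i32, i64 }*
  %tag_ptr = getelementptr { i32, i32, i64 }, { i32, i32, i64 }* %as_c1, i32 0, i32 0
  %tag = load i32, i32* %tag_ptr
  ret i32 %tag
}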
Here's a code snippet from Clang's getTypeExpansion(), showing it also does this - manually finding the largest field:
const FieldDecl *LargestFD = nullptr;
CharUnits UnionSize = CharUnits::Zero();
for (const auto *FD : RD->fields()) {
    // Skip zero length bitfields.
    if (FD->isBitField() && FD->getBitWidthValue(Context) == 0)
        continue;
    assert(!FD->isBitField() &&
           "Cannot expand structure with bit-field members.");
    CharUnits FieldSize = Context.getTypeSizeInChars(FD->getType());
    if (UnionSize < FieldSize) {
        UnionSize = FieldSize;
        LargestFD = FD;
    }
}
if (LargestFD)
    Fields.push_back(LargestFD);
I'm new to Z3 and searched for the answer to my question here and on Google. Unfortunately, I was not successful.
I'm using the Z3 4.0 C/C++ API. I declared an undefined function d: (Int Int) Int, added some assertions, and computed a model. So far, that works fine.
Now, I want to extract certain values of the function d defined by the model, say d(0,0). The following statement works, but returns an expression rather than the function value of d(0,0), i.e., an integer.
z3::expr args[] = {c.int_val(0), c.int_val(0)};
z3::expr result = m.eval(d(2, args));
The check
result.is_int();
returns true.
My (hopefully not too stupid) question is how to cast the returned expression to a C/C++ int?
Help is very appreciated. Thank you!
Z3_get_numeral_int is what you're looking for.
Here is an excerpt from the documentation:
Z3_bool Z3_get_numeral_int(__in Z3_context c, __in Z3_ast v, __out int * i)
Similar to Z3_get_numeral_string, but only succeeds if the value can fit in a
machine int. Return Z3_TRUE if the call succeeded.
You should be careful though. Z3's integer is mathematical integer which can easily exceed the range of 32-bit int. In that sense, using Z3_get_numeral_string and parsing string to big integer is a better option.
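A minimal sketch of that, dropping down to the C API from the C++ wrapper objects used in the question (c is the z3::context, result the evaluated expression; both convert implicitly to their C handles):

int value = 0;
if (Z3_get_numeral_int(c, result, &value) == Z3_TRUE) {
    std::cout << "d(0,0) = " << value << std::endl;
} else {
    // Did not fit into a machine int; fall back to the string form.
    std::string s = Z3_get_numeral_string(c, result);
}

Newer versions of the C++ API also offer z3::expr::get_numeral_int(), which throws if the expression is not a numeral that fits in an int.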
Just read on an internal university thread:
#include <iostream>
using namespace std;
union zt
{
    bool b;
    int i;
};
int main()
{
    zt w;
    bool a, b;
    a = 1;
    b = 2;
    cerr << (bool)2 << static_cast<bool>(2) << endl; //11
    cerr << a << b << (a == b) << endl; //111
    w.i = 2;
    int q = w.b;
    cerr << (bool)q << q << w.b << ((bool)((int)w.b)) << w.i << (w.b == a) << endl; //122220
    cerr << ((w.b == a) ? 'T' : 'F') << endl; //F
}
So a, b, and w.b are all declared as bool. a is assigned 1, b is assigned 2, and the internal representation of w.b is changed to 2 (using the union).
This way all of a, b, and w.b will be true, but a and w.b won't be equal, so this might mean that the universe is broken (true != true).
I know this problem is more theoretical than practical (a sane programmer doesn't want to change the internal representation of a bool), but here are the questions:
Is this okay? (this was tested with g++ 4.3.3) I mean, should the compiler be aware that during boolean comparison any non-zero value might mean true?
Do you know any case where this corner case might become a real issue? (For example while loading binary data from a stream)
EDIT:
Three things:
bool and int have different sizes, and that's okay. But what if I use char instead of int? Or what about when sizeof(bool) == sizeof(int)?
Please answer the two questions I asked if possible. I'm actually interested in answers to the second question too, because in my honest opinion this might (or might not) be a real problem in embedded systems, which are sometimes 8-bit systems.
New question: Is this really undefined behavior? If so, why? If not, why not? Aren't there any assumptions about the boolean comparison operators in the specs?
If you read a member of a union that is a different member than the last member which was written then you get undefined behaviour. Writing an int member and then reading the union's bool member could cause anything to happen at any subsequent point in the program.
The only exception is where the union is a union of structs that all contain a common initial sequence, in which case the common sequence may be read.
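A sketch of that exception (the member names are illustrative):

struct A { int tag; float x; };
struct B { int tag; double y; };

union U { A a; B b; } u;

u.a.tag = 1;
int t = u.b.tag;  // OK: tag belongs to the common initial sequence of A and B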
Is this okay? (this was tested with g++ 4.3.3) I mean, should the compiler be aware that during boolean comparison any non-zero value might mean true?
Any integer value that is non zero (or pointer that is non NULL) represents true.
But when comparing integers and bool the bool is converted to int before comparison.
Do you know any case where this corner case might become a real issue? (For example while loading binary data from a stream)
It is always a real issue.
Is this okay?
I don't know whether the specs say anything about this. A compiler might always generate code like ((a!=0) && (b!=0)) || ((a==0) && (b==0)) when comparing two booleans, although this might decrease performance.
In my opinion this is not a bug, but undefined behaviour, although I think that every implementor should tell its users how boolean comparisons are made in their implementation.
If we go by your last code sample, both a and b are bool and set to true by assigning 1 and 2 respectively (note that the 1 and 2 disappear; they are now just true).
So breaking down your expression:
a!=0 // true (a converted to 1 because of auto-type conversion)
b!=0 // true (b converted to 1 because of auto-type conversion)
((a!=0) && (b!=0)) => (true && true) // true ( no conversion done)
a==0 // false (a converted to 1 because of auto-type conversion)
b==0 // false (b converted to 1 because of auto-type conversion)
((a==0) && (b==0)) => (false && false) // false ( no conversion done)
((a!=0) && (b!=0)) || ((a==0) && (b==0)) => (true || false) => true
So I would always expect the above expression to be well defined and always true.
But I am not sure how this applies to your original question. When assigning an integer to a bool, the integer is converted to bool (as described several times). The actual representation of true is not defined by the standard and could be any bit pattern that fits in a bool (you may not assume any particular bit pattern).
When comparing the bool to int the bool is converted into an int first then compared.
Any real-world case
The only thing that pops into my mind: if someone reads binary data from a file into a struct that has bool members, the problem might arise if the file was made by another program that wrote 2 instead of 1 into the place of the bool (maybe because it was written in another programming language).
But this might just be bad programming practice.
Writing data in a binary format is non-portable without knowledge of the underlying platform.
There are problems with the size of each object.
There are problems with representation:
Integers (endianness)
Float (representation undefined by the standard; it usually depends on the underlying hardware)
Bool (binary representation is undefined by the standard)
Struct (padding between members may differ)
With all these you need to know the underlying hardware and the compiler. Different compilers or different versions of the compiler or even a compiler with different optimization flags may have different behaviors for all the above.
The problem with Union
union X
{
    int a;
    bool b;
};
As people mention, writing to 'a' and then reading from 'b' is undefined.
Why? Because we do not know how 'a' or 'b' is represented on this hardware. Writing to 'a' will fill out the bits in 'a', but how does that reflect on the bits in 'b'? If your system uses a 1-byte bool and a 4-byte int with the lowest byte in low memory and the highest byte in high memory, then writing 1 to 'a' will put 1 in 'b'. But then how does your implementation represent a bool? Is true represented by 1 or 255? What happens if you put a 1 in 'b' while for all other uses of true it uses 255?
So unless you understand both your hardware and your compiler, the behavior will be unexpected.
Thus these uses are undefined but not disallowed by the standard. The reason they are allowed is that you may have done the research and found that on your system with this particular compiler you can do some freaky optimization by making these assumptions. But be warned: any change in the assumptions will break your code.
Also, when comparing two types the compiler will do some automatic conversions before the comparison; remember, both operands are converted to the same type before being compared. For comparisons between integers and bool, the bool is converted into an integer and then compared against the other integer (the conversion maps false to 0 and true to 1). If both operands are bool then no conversion is required and the comparison is done using boolean logic.
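A sketch of what that promotion means in practice:

bool t = true;  // stored as 1 in a well-behaved program
int i = 2;
// In (t == i) the bool is promoted first: true -> 1, so this compares 1 == 2.
bool eq = (t == i);  // false, even though both are "truthy"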
Normally, when assigning an arbitrary value to a bool the compiler will convert it for you:
int x = 5;
bool z = x; // automatic conversion here
The equivalent code generated by the compiler will look more like:
bool z = (x != 0) ? true : false;
However, the compiler will only do this conversion once. It would be unreasonable for it to assume that any nonzero bit pattern in a bool variable is equivalent to true, especially for doing logical operations like and. The resulting assembly code would be unwieldy.
Suffice to say that if you're using union data structures, you know what you're doing and you have the ability to confuse the compiler.
The boolean is one byte, and the integer is four bytes. When you assign 2 to the integer, the fourth byte has a value of 2, but the first byte has a value of 0. If you read the boolean out of the union, it's going to grab the first byte.
Edit: D'oh. As Oleg Zhylin points out, this only applies to a big-endian CPU. Thanks for the correction.
I believe what you're doing is called type punning:
http://en.wikipedia.org/wiki/Type_punning
Hmm strange, I am getting different output from codepad:
11
111
122222
T
The code also seems right to me; maybe it's a compiler bug?
Just to write down my points of view:
Is this okay?
I don't know whether the specs say anything about this. A compiler might always generate code like ((a!=0) && (b!=0)) || ((a==0) && (b==0)) when comparing two booleans, although this might decrease performance.
In my opinion this is not a bug, but undefined behaviour, although I think that every implementor should tell its users how boolean comparisons are made in their implementation.
Any real-world case
The only thing that pops into my mind: if someone reads binary data from a file into a struct that has bool members, the problem might arise if the file was made by another program that wrote 2 instead of 1 into the place of the bool (maybe because it was written in another programming language).
But this might just be bad programming practice.
One more: in embedded systems this bug might be a bigger problem than on a "normal" system, because programmers usually do more "bit-magic" to get the job done.
Addressing the questions posed, I think the behavior is okay and shouldn't be a problem in the real world. As we don't have ^^ (a logical XOR) in C++, I would suggest !bool == !bool as a safe bool comparison technique.
This way every non-zero value in a bool variable is converted to zero (false) and every zero to the same non-zero value (true), so both sides are normalized before the comparison.
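A sketch of the effect on plain ints, where it is easiest to see:

int a = 2, b = 1;        // both "true" in the C sense
bool naive = (a == b);   // false: compares 2 == 1
bool safer = (!a == !b); // true: both sides normalize to false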