Reduce C++ source by variable substitution and dead code truncation? - c++

Suppose I have some template classes with Nontype parameters.
template <int hi, int wid>
class SomeThing {
...
};
I need to create a tool that reduces this source for given values of hi and wid, say hi=2 and wid=3. As a consequence, some code may become dead, and the tool also needs to strip it out. In the end, I expect the tool to output the reduced source code. Is there a known way to do this? The hard way would be to write my own C++ parser, which sounds terrible even for a simplified one.
I know there are tools like gcc-xml and clang that can parse the source and produce an easy-to-parse intermediate file. However, that does not seem to be enough for me to regenerate a C++ source file from it.
[EDIT]
A whole picture is to create a tool to generate source code from source code, with variable substitution and dead code truncation.

I'm not sure I fully understood your question, but would template specialization answer your needs?
template<>
class SomeThing<2, 3> {
//trimmed content
};
If you instantiate SomeThing with the values 2 and 3, the compiler will choose the specialization, and the generated executable will contain only the "truncated" content.
Edit:
Based on your edit, I suspect that you'd like to have a partial evaluator for C++, meaning a program which takes a program and some of its inputs, and generates a specialized version of the program in which everything that could be evaluated has been evaluated.
I'm not aware of any existing implementation for native C++. You can, however, find partial evaluators for many functional languages, and also for Pascal and C. Some work has been done on a partial evaluator for .NET bytecode (MSIL), which could be used to partially evaluate C++/CLI. [Chepovsky et al. 2003]
The C++ template mechanism can be seen as a limited kind of partial evaluation, since the compiler generates code specialized (and potentially partially evaluated) with the template parameters. All of this is performed internally by the compiler, though, so there is no intermediate C++ source code that you could inspect. You can, however, look at the generated assembly code, which gives you a good idea of the operations/evaluations/optimizations performed by the compiler during template instantiation.
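As a rough illustration of the compiler acting as a limited partial evaluator, the sketch below uses C++17 `if constexpr` inside a class template with the question's `hi` and `wid` parameters. The member function `cells()` and both branch bodies are hypothetical and only serve to show a branch becoming dead for given arguments.

```cpp
// Hypothetical sketch: when the template is instantiated, the branch that is
// dead for the given non-type arguments is discarded by the compiler.
template <int hi, int wid>
struct SomeThing {
    static constexpr int cells() {
        if constexpr (hi > wid) {
            // Only instantiated when hi > wid.
            return hi * hi;
        } else {
            // Only instantiated when hi <= wid.
            return hi * wid;
        }
    }
};

// For hi=2, wid=3 only the else-branch survives: 2 * 3.
static_assert(SomeThing<2, 3>::cells() == 6, "else-branch selected");
// For hi=4, wid=3 only the if-branch survives: 4 * 4.
static_assert(SomeThing<4, 3>::cells() == 16, "if-branch selected");
```

The reduced form exists only inside the compiler, so this shows the evaluation but does not produce reduced source text.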

There's no clean way to do this, as template code is generally Turing-complete.
As a very simple example, consider
template<int I>
class X : public X<I/2>
{
};
Now say that you want to reduce this for I==351. What exactly should the base classes be? For real-world code you will need a full C++ compiler. Worse, you will also need a matching standard library implementation, and one that is fully representative of all compliant standard library implementations (!!)
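To make the I==351 case concrete, here is a minimal sketch of the chain of base classes the compiler has to work out. It assumes a terminating specialization `X<0>` and a `depth` member, both hypothetical additions: the example above has neither, and without a stopping case `X<0>` would have to derive from itself.

```cpp
#include <type_traits>

// Primary template: each X<I> derives from X<I/2>.
template <int I>
struct X : X<I / 2> {
    // Hypothetical helper: number of halving steps until the chain ends.
    static constexpr int depth = X<I / 2>::depth + 1;
};

// Hypothetical stopping case; not part of the original example.
template <>
struct X<0> {
    static constexpr int depth = 0;
};

// 351 -> 175 -> 87 -> 43 -> 21 -> 10 -> 5 -> 2 -> 1 -> 0
static_assert(X<351>::depth == 9, "nine halving steps from 351 to 0");
static_assert(std::is_base_of<X<175>, X<351>>::value, "direct base");
static_assert(std::is_base_of<X<1>, X<351>>::value, "indirect base");
```

A source-level reducer would have to reproduce this arithmetic and the specialization lookup exactly, which is the job a full compiler front end already does.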
Consider the following code:
template <int I>
class X : public std::vector<X<I/2> >
{
// Methods
};
Dead code elimination will depend on the implementation of std::vector. If the implementations differ, you can accidentally eliminate code that is in fact needed.

Related

How to know the required interface/contract of template arguments in C++?

Sorry for the newbie question, but I have a feeling I am missing something here:
If I have a certain class template which looks like this (basically the only way to pass a lambda to a function in C++, unless I am mistaken):
template<typename V, typename F>
class Something
{
public:
int some_method(V val, F func) {
double intermediate = val.do_something();
return func(intermediate);
}
};
By reading the implementation of this class, I can see that the V class must implement double do_something(), and that F must be a function/functor with the signature int F(double).
However, in languages like Java or C#, the constraints for the generic parameters are explicitly stated in the generic class signature, so they are obvious without having to look at the source code, e.g.
class Something<V> where V : IDoesSomething // interface with the DoSomething() method
{
// func delegate signature is explicit
public int SomeMethod(V val, Func<double, int> func)
{
double intermediate = val.DoSomething();
return func(intermediate);
}
}
My question is: how do I know how to implement more complex input arguments in practice? Can this somehow be documented using code only, when writing a library with template classes in C++, or is the only way to parse the code manually and look for parameter usage?
(or third possibility, add methods to the class until the compiler stops failing)
C# and Java Generics have similar syntax and some common uses with C++ templates, but they are very different beasts.
Here is a good overview.
In C++, templates have traditionally been checked when the code is instantiated, and the requirements live in the documentation.
Note that many of the requirements of C++ templates are semantic, not syntactic: iterators need not only to have the proper operations, those operations also need to have the proper meaning.
You can check syntactic properties of types in C++ templates. Off the top of my head, there are 6 basic ways.
You can have a traits class requirement, like std::iterator_traits.
You can do SFINAE, an accidentally Turing-complete template metaprogramming technique.
You can use concepts and/or requires clauses if your compiler is modern enough.
You can generate static_asserts to check properties
You can use traits ADL functions, like begin.
You can just duck type, and use it as if it had the properties you want. If it quacks like a duck, it is a duck.
All of these have pluses and minuses.
The downside to all of them is that they can be harder to set up than "this parameter must inherit from the type Foo". Concepts can handle that, only a bit more verbose than Java.
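As a sketch of two of the techniques listed above (a SFINAE-based detection trait plus static_assert), here is the question's Something template with its requirements spelled out in code. It assumes C++17 for std::void_t and std::is_invocable_r. The trait name `has_do_something` and the types `Sensor`, `Truncate` and `NoSuchMethod` are hypothetical.

```cpp
#include <type_traits>
#include <utility>

// Detection trait: true when `v.do_something()` is a valid expression
// whose result converts to double.
template <typename V, typename = void>
struct has_do_something : std::false_type {};

template <typename V>
struct has_do_something<
    V, std::void_t<decltype(std::declval<V&>().do_something())>>
    : std::is_convertible<decltype(std::declval<V&>().do_something()), double> {};

template <typename V, typename F>
class Something {
    // The requirements are now stated in code and reported in plain words.
    static_assert(has_do_something<V>::value,
                  "V must provide do_something() convertible to double");
    static_assert(std::is_invocable_r<int, F&, double>::value,
                  "F must be callable as int(double)");

public:
    int some_method(V val, F func) {
        double intermediate = val.do_something();
        return func(intermediate);
    }
};

// Hypothetical types used only to exercise the checks.
struct Sensor {
    double do_something() { return 2.5; }
};
struct Truncate {
    int operator()(double d) const { return static_cast<int>(d); }
};
struct NoSuchMethod {};

static_assert(has_do_something<Sensor>::value, "Sensor qualifies");
static_assert(!has_do_something<NoSuchMethod>::value, "NoSuchMethod does not");
```

Instantiating `Something<NoSuchMethod, Truncate>` would then stop at the first static_assert with the message given there, instead of failing somewhere inside some_method.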
Java style type erasure can be dominated using C++ templates. std::function is an example of a duck typed type eraser that allows unrelated types to be stored as values; doing something as restricted as Java is rarely worthwhile, as the machinery to do it is harder than making something more powerful.
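A small sketch of the std::function point: unrelated callable types with the same call signature can be stored side by side as values. The helper names (`round_down`, `AddOffset`, `make_callables`, `apply_all`) are made up for illustration.

```cpp
#include <functional>
#include <vector>

// A plain function.
int round_down(double d) { return static_cast<int>(d); }

// A functor carrying state.
struct AddOffset {
    int offset;
    int operator()(double d) const { return static_cast<int>(d) + offset; }
};

// Three unrelated types, erased behind one value type: std::function<int(double)>.
inline std::vector<std::function<int(double)>> make_callables() {
    return {
        round_down,
        AddOffset{10},
        [](double d) { return static_cast<int>(d * 2); }  // a lambda
    };
}

// Applies every stored callable to the same input and sums the results.
inline int apply_all(const std::vector<std::function<int(double)>>& fs, double x) {
    int total = 0;
    for (const auto& f : fs) {
        total += f(x);
    }
    return total;
}
```

None of the three callables shares a base class or interface; the only requirement is that each can be called as int(double), which is the duck typing described above.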
C# reification cannot be fully duplicated by C++, because C#'s runtime environment ships with a compiler, and can effectively compile a new type when you instantiate at runtime. I have seen people ship compilers with C++ programs, compile dynamic libraries, then load and execute them, but that isn't something I'd advise.
With C++20, you can write:
template<Number N>
struct Polynomial;
where Number is a concept which checks N (a type) against its properties. This is a bit like the Java signature stuff on steroids.
Compile-time reflection, which did not make it into C++23 and is targeted at a later standard, is expected to let you do things that make templates look like preprocessor macros.

C++ How do compilers handle templates [duplicate]

As some of you may know from my recent posts, I am studying for a C++ exam for a class whose content was delivered very poorly. I am basically having to teach myself everything, so bear with me here.
This is an exam question:
(i)Explain the concepts of templates as defined in the C++ language.
Be sure to differentiate between what the programmer does and what the
compiler does.
My current rationale:
(i) A template allows a function or class to operate on generic types. This allows the programmer to effectively program some functionality once and use it with many different data types, without having to rewrite the application or parts of it multiple times.
My problem is that I have no idea how the compiler handles the use of templates.
I am unsure what the compiler does at this stage. If somebody could clear this up, it would be helpful.
Templates in C++ are implemented through substitution. It's not like Java generics which just type check the code which involves the generics class and then compiles it using raw references (type erasure).
Basically C++ creates a different class/method for each actual template argument used in your code. If you have your
template<typename T>
void myMethod(T t)
{
//
}
What happens at compile time is that a different function is compiled for each type the template is actually used with. If you call myMethod(50) and myMethod("foo"), then two versions of the function (one for int, one for const char*) will exist in the compiled program. Intuitively this means that templates could cause code bloat, but in practice the same expressiveness without templates would require a larger, less readable codebase, so that's not a real concern.
So there is no black magic behind them (ok there is if you consider meta programming or partial specialization).
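One way to observe that each type really gets its own function is to give the template a function-local static, as in the sketch below. The helpers `call_count` and `demo_calls` are hypothetical additions, and myMethod is given a body only for this demonstration.

```cpp
// Each distinct T produces a separate instantiation of call_count<T>,
// and therefore a separate static counter.
template <typename T>
int& call_count() {
    static int n = 0;
    return n;
}

template <typename T>
void myMethod(T) {
    ++call_count<T>();
}

inline void demo_calls() {
    myMethod(50);     // instantiates myMethod<int>
    myMethod(51);     // reuses the same myMethod<int>
    myMethod("foo");  // instantiates myMethod<const char*>
}
```

After demo_calls() runs, the int counter holds 2 and the const char* counter holds 1, because the two instantiations share no state.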
Let's say you write a function using templates:
template <typename T>
void function(T t){
doSomething();
}
For each data type you call this function with, the compiler simply replaces 'T' with that data type, say 'int', and generates code as if you had written the function with 'int' instead of 'T' from the beginning.
This is probably the right (but not the complete) answer, if others agree.
For each different type you create an object with or, in the case of functions, each different argument type you use, the compiler simply generates a separate version at compile time. So if you have a function template such as a sort function and use it for int and double arrays, then the compiler has actually made two functions: one using int and the other using double. This is the simplest explanation I could give. Hope it's useful.

C++ should I use templates, I'm about to create a lexer, why should it be limited chars? [closed]

I'm about to create a lexer for a project. A proof of concept exists, the idea works and whatnot. I was about to start writing it when I realised:
Why chars?
(I'm moving away from C, I'm still fairly suspicious of the standard libraries, I felt it easier to deal in char* for offsets and such than learn about strings)
Why not wchar_t or something, ints, or indeed any type (given it has some defined operations)?
So should I use a template? So far it seems like yes I should but there are 2 counter-arguments I can consider:
Firstly, modular compilation: the moment I write "template" it must go in a header file / be available with its implementation to whatever uses it (it's not a matter of hiding source code, I don't mind having to show the code, it will be free (as in freedom) software). This means extra parsing and things like that.
My C background screams not to do this, I seem to want separate .o files, I understand why I can't by the way, I'm not asking for a way.
Separate object files speed up compilation, because the makefile (you tell it, or have it use -MM with the compiler to figure it out for itself) won't run the compilation command for things that haven't changed, and so forth.
Secondly, with templates, I know of no way to specify what a type must do, other than have the user realise when something fails (you know how Java has an extends keyword) I suspect that C++11 builds on this, as meta-programming is a large chapter in "The C++ programming language" 4th edition.
Are these reasons important these days? I learned with the following burned into my mind:
"You are not creating one huge code file that gets compiled, you create little ones that are linked" and templates seem to go against this.
I'm sure G++ parses very quickly, but it also optimises. If it spends a lot of time optimising one file, it'll re-do that optimisation every time it sees that in a translation unit, whereas with separate object files it does a bit (general optimisations) only once, and perhaps a bit more if you use LTO (link time optimisation).
Or I could create a class that every input to the lexer derives from and use that (generic programming I believe it's called) but my C-roots say "eww virtuals" and urge me towards the char*
I understand this is quite open, I just don't know where to draw the line between using a template, and not using a template.
Templates don't have to be in the header! If you have only a few instantiations, you can explicitly instantiate the class and function templates in suitable translation units. That is, a template would be split into three parts:
1. A header declaring the templates.
2. A header including the first and implementing the templates, but otherwise only included in the third set of files.
3. Source files including the headers in 2. and explicitly instantiating the templates with the corresponding types.
Users of these templates would only include the first header and never the implementation header. An example where this can be done is IOStreams: there are basically just two instantiations, one for char and one for wchar_t. Yes, you can instantiate the streams for other types, but I doubt that anybody would do so (I sometimes question whether anybody uses streams with a character type other than char, but probably people do).
That said, the concepts used by templates are, indeed, not explicitly represented in the source and C++11 doesn't add any facilities to do so either. There were discussions on adding concepts to C++ but so far they are not part of any standard. There is a concepts light proposal which, I think, will be included in C++14.
However, in practice I haven't found that much of a problem: it is quite possible to document the concepts and use things like static_assert() to potentially produce nicer error messages. The problem is more that many concepts are actually more restrictive than the underlying algorithms and that the extra slack is sometimes quite useful.
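As a sketch of the static_assert() approach for the lexer in the question, the class below documents its one requirement on the character type and reports it in plain words. The class name `basic_lexer`, the choice of std::is_integral as the requirement, and the toy `count_tokens()` member are all assumptions made for illustration.

```cpp
#include <cstddef>
#include <type_traits>

template <typename CharT>
class basic_lexer {
    // Documented requirement, checked once per instantiation.
    static_assert(std::is_integral<CharT>::value,
                  "basic_lexer<CharT>: CharT must be an integral, character-like type");

public:
    // input must point to a zero-terminated sequence of CharT.
    explicit basic_lexer(const CharT* input) : input_(input) {}

    // Toy tokenizer: counts runs of code units separated by spaces.
    std::size_t count_tokens() const {
        std::size_t n = 0;
        bool in_token = false;
        for (const CharT* p = input_; *p != CharT(); ++p) {
            const bool is_space = (*p == CharT(' '));
            if (!is_space && !in_token) {
                ++n;  // first code unit of a new token
            }
            in_token = !is_space;
        }
        return n;
    }

private:
    const CharT* input_;
};
```

With this in place, `basic_lexer<double>` fails with the message from the static_assert instead of an error deep inside count_tokens().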
Here is a brief and somewhat made-up example of how to implement and instantiate the template. The idea is to implement something like std::basic_ostream but merely provide a scaled-down version of a string output operator:
// simple-ostream.hpp
#include "simple-streambuf.hpp"
template <typename CT>
class simple_ostream {
simple_streambuf<CT>* d_sbuf;
public:
simple_ostream(simple_streambuf<CT>* sbuf);
simple_streambuf<CT>* rdbuf() { return this->d_sbuf; } // should be inline
};
template <typename CT>
simple_ostream<CT>& operator<< (simple_ostream<CT>&, CT const*);
Except for the rdbuf() member the above is merely a class definition with a few member declarations and a function declaration. The rdbuf() function is implemented directly to show that you can mix&match the visible implementation where performance is necessary with external implementation where decoupling is more important. The used class template simple_streambuf is thought to be similar to std::basic_streambuf and, at least, declared in the header "simple-streambuf.hpp".
// simple-ostream.tpp
// the implementation, only included to create explicit instantiations
#include "simple-ostream.hpp"
template <typename CT>
simple_ostream<CT>::simple_ostream(simple_streambuf<CT>* sbuf): d_sbuf(sbuf) {}
template <typename CT>
simple_ostream<CT>& operator<< (simple_ostream<CT>& out, CT const* str) {
for (; *str; ++str) {
out.rdbuf()->sputc(*str);
}
return out;
}
This implementation header is only included when explicitly instantiating the class and function templates. For example, the instantiations for char would look like this:
// simple-ostream-char.cpp
#include "simple-ostream.tpp"
// instantiate all class members for simple_ostream<char>:
template class simple_ostream<char>;
// instantiate the free-standing operator
template simple_ostream<char>& operator<< <char>(simple_ostream<char>&, char const*);
Any use of the simple_ostream<CT> would just include simple-ostream.hpp. For example:
// use-simple-ostream.cpp
#include "simple-ostream.hpp"
int main()
{
simple_streambuf<char> sbuf;
simple_ostream<char> out(&sbuf);
out << "hello, world\n";
}
Of course, to build an executable you will need both use-simple-ostream.o and simple-ostream-char.o but assuming the template instantiations are part of a library this isn't really adding any complexity. The only real headache is when a user wants to use the class template with unexpected instantiations, say, char16_t, but only char and wchar_t are provided: In this case the user would need to explicitly create the instantiations or, if necessary, include the implementation header.
In case you want to try the example out, below is a somewhat simple-minded and sloppy (because it is header-only) implementation of simple_streambuf<CT>, to be saved as simple-streambuf.hpp:
#ifndef INCLUDED_SIMPLE_STREAMBUF
#define INCLUDED_SIMPLE_STREAMBUF
#include <iostream>
template <typename CT> struct stream;
template <>
struct stream<char> {
static std::ostream& get() { return std::cout; }
};
template <>
struct stream<wchar_t> {
static std::wostream& get() { return std::wcout; }
};
template <typename CT>
struct simple_streambuf
{
void sputc(CT c) {
stream<CT>::get().rdbuf()->sputc(c);
}
};
#endif
Yes, it should be limited to chars. Why ? Because you're asking...
I have little experience with templates, but when I used templates the necessity arose naturally, I didn't need to try to use templates.
My 2 cents, FWIW.
1: Firstly, modular compilation, the moment I write "template" it must go in a header file…
That's not a real argument. You have the ability to use C, C++, structs, classes, templates, classes with virtual functions, and all the other benefits of a multi paradigm language. You're not coerced to take an all-or-nothing approach with your designs, and you can mix and match these functionalities based on your design's needs. So you can use templates where they are an appropriate tool, and other constructs where templates are not ideal. It's hard to know when that will be, until after you have had experience using them all. Template/header-only libraries are popular, but one of the reasons the approach is used is that they simplify linking and the build process, and can reduce dependencies if designed well. If they are designed poorly, then yes, they can result in an explosion in compile times. That's not the language's fault -- it's the implementor's design.
Heck, you could even put your implementations behind C opaque types and use templates for everything, keeping the core template code visible to exactly one translation.
2: Secondly, with templates, I know of no way to specify what a type must do…
That is generally regarded as a feature. Instantiation can result in further instantiations which is capable of instantiating different implementations and specializations -- this is template meta programming domain. Often, all you really need to do is instantiate the implementation, which results in evaluation of the type and parameters. This -- simulation of "concepts" and interface verification -- can increase your build times, however. But furthermore, that may not be the best design because deferring instantiation is in many cases preferable.
If you just need to brute-force instantiate all your variants, one approach would be to create a separate translation which does just that -- you don't even need to link it to your library; add it to a unit test or some separate target. That way, you could validate instantiation and functionalities are correct without significant impact to your clients including/linking to the library.
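A minimal sketch of such an instantiation-only translation unit is shown below. The template `span_length` is a hypothetical stand-in for your own library code.

```cpp
#include <cstddef>

// Stand-in for a library template (hypothetical).
template <typename CharT>
struct span_length {
    // Length of a zero-terminated sequence of CharT.
    static std::size_t of(const CharT* s) {
        std::size_t n = 0;
        while (s[n] != CharT()) {
            ++n;
        }
        return n;
    }
};

// Explicit instantiation definitions: every member is compiled for each
// listed type right here, even if nothing in this file calls it.
template struct span_length<char>;
template struct span_length<wchar_t>;
template struct span_length<char16_t>;
```

If a member does not compile for one of the listed types, this file fails to build, so the problem shows up in your test target and not in a client's build.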
Are these reasons important these days?
No. Build times are of course very important, but I think you just need to learn the right tool to use, and when and why some implementations must be abstracted (or put behind compilation firewalls) when/if you need fast builds and scalability for large projects. So yes, they are important, but a good design can strike a good balance between versatility and build times. Also remember that template metaprogramming is capable of moving a significant amount of program validation from runtime to compile time. So a hit on compile times does not have to be bad, because it can save you from a lot of runtime validations/issues.
I'm sure G++ parses very quickly, but it also optimises, if it spends a lot of time optimising one file, it'll re-do that optimisation every time it sees that in a translation unit…
Right; That redundancy can kill fast build times.
whereas with separate object files, it does a bit (general optimisations) only once, and perhaps a bit more if you use LTO (link time optimisation) … Separate object files speed up compilation, because the makefile (you tell it, or have it use -MM with the compiler to figure it out for itself) won't run the compilation command for things that haven't changed, and so forth.
Not necessarily so. First, many object files produce a lot of demand on the linker. Second, it multiplies the work because you have more translations, so reducing object files is a good thing. This really depends on the structure of your libraries and dependencies. Some teams take the approach the opposite direction (I do quite regularly), and use an approach which produces few object files. This can make your builds many times faster with complex projects because you eliminate redundant work for the compiler and linker. For best results, you need a good understanding of the process and your dependencies. In large projects, translation/object reductions can result in builds which are many times faster. This is often referred to as a "Unity Build". Large Scale C++ Design by John Lakos is a great read on dependencies and C++ project structures, although it's rather dated at this point so you should not take every bit of advice at face value.
So the short answer is: Use the best tool for the problem at hand -- a good designer will use many available tools. You're far from exhausting the capabilities of the tools and build systems. A good understanding of these subjects will take years.

Synthesizing the interface of T required by a class template

template <typename T>
class A
{
// use the type parameter T in various ways here
}
Is there any way to automatically synthesize a workable class definition for T, as used by the template A? My expectation is a tool or a compiler trick that could generate the boiler plate code for the type parameter T, which I could tweak further to my needs.
I know if I wrote the class A, I could provide some hints to the "user" using boost concepts checks etc... But it's an unfamiliar code base where I didn't have the luxury of writing class A. So far I build the needed parameter class T manually, by reading the code for class A and with the able assistance from the compiler (with its terse messages).
Is there a better way?
If I understand you correctly, you are looking for a way to automatically generate a concept archetype for a given template class. Currently, this is not possible and maybe it never will be.
The main problem here is that it is very hard to say anything about the semantics of A's code without any a priori knowledge of T. Dave Abrahams wrote a blog post not too long ago, where he showed that it is possible to call an unconstrained function from code that is constrained by concepts and the compiler will still be able to perform the concept checks correctly.
But what you are asking for is a compiler that synthesizes concept checks out of thin air. I'm not much of a compiler person but I can't think of a way to make this possible with today's tools. Although it surely would be very cool if this became possible some day.
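Nothing generates this automatically, but the manual workflow from the question can at least be made checkable. The sketch below hand-writes an archetype with only the operations the template was observed to need, then explicitly instantiates the template with it. Both the stand-in template `A` and the class `T_archetype` are hypothetical.

```cpp
// Stand-in for the unfamiliar template (hypothetical body).
template <typename T>
class A {
public:
    explicit A(const T& t) : scaled_(t.weight() * 2) {}
    int scaled() const { return scaled_; }

private:
    int scaled_;
};

// Hand-written archetype: exposes only what A<T> needs and deliberately
// disables everything else, so any hidden requirement becomes a compile error.
class T_archetype {
public:
    explicit T_archetype(int w) : w_(w) {}
    T_archetype(const T_archetype&) = delete;             // A<T> never copies T
    T_archetype& operator=(const T_archetype&) = delete;  // nor assigns it
    int weight() const { return w_; }

private:
    int w_;
};

// Forces every member of A<T_archetype> to compile.
template class A<T_archetype>;
```

If A later starts copying T or calling another member, the explicit instantiation stops compiling, and the error tells you what to add to the archetype.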

reuse function logic in a const expression

I think my question is: is there any way to emulate the behaviour that we'll gain from C++0x's constexpr keyword with the current C++ standard (that is, if I understand correctly what constexpr is supposed to do)?
To be more clear, there are times when it is useful to calculate a value at compile time but it is also useful to be able to calculate it at runtime too, for e.g. if we want to calculate powers, we could use the code below.
template<int X, unsigned int Y>
struct xPowerY_const {
static const int value = X*xPowerY_const<X,Y-1>::value;
};
template<int X>
struct xPowerY_const<X, 1> {
static const int value = X;
};
int xPowerY(int x, unsigned int y) {
return (y==1) ? x : x*xPowerY(x,y-1);
}
This is a simple example but in more complicated cases being able to reuse the code would be helpful. Even if, for runtime performance, the recursive nature of the function is suboptimal and a better algorithm could be devised it would be useful for testing the logic if the templated version could be expressed in a function, as I can't see a reasonable method of testing the validity of the constant template method in a wide range of cases (although perhaps there is one and i just can't see it, and perhaps that's another question).
Thanks.
Edit
Forgot to mention, I don't want to #define
Edit2 Also my code above is wrong, it doesn't deal with x^0, but that doesn't affect the question.
Template metaprogramming implements logic in an entirely different (and incompatible) way from "normal" C++ code. You're not defining a function, you're defining a type. It just happens that the type has a value associated with it, which is built up from a combination of other types.
Because the templates define types, there is no program logic involved. The logic is simply a side effect of the compiler trying to resolve relationships between the templated types. There really isn't any way to automatically extract the high level logic from a template "program" into a function.
FWIW, template metaprogramming wasn't even a glimmer in Bjarne's eye when templates were first implemented. They were actually discovered later on in the language's life by users of the language. It's an "unintended" side-effect of the type system that just happened to become very popular. It's precisely because of this discovery that new features are being added to the language to more thoroughly support the idioms that have evolved.
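For comparison, here is a minimal sketch of what the constexpr feature mentioned in the question allows, as standardized in C++11: a single function that the compiler can evaluate at compile time and that can also be called with runtime values. It reuses the question's name xPowerY and, unlike the original, handles y == 0.

```cpp
// One definition serves both compile-time and runtime evaluation.
// C++11 constexpr functions are limited to a single return statement,
// hence the conditional expression.
constexpr int xPowerY(int x, unsigned int y) {
    return y == 0 ? 1 : x * xPowerY(x, y - 1);
}

// Compile-time use: the results are constant expressions.
static_assert(xPowerY(2, 10) == 1024, "evaluated by the compiler");
static_assert(xPowerY(7, 0) == 1, "x^0 is handled");

// Compile-time use as an array bound.
static int table[xPowerY(2, 3)];
static_assert(sizeof(table) / sizeof(table[0]) == 8, "2^3 elements");

// Runtime use: the same function called with values not known until run time.
inline int runtime_power(int x, unsigned int y) {
    return xPowerY(x, y);
}
```

Because the logic lives in an ordinary function, it can be unit-tested at run time over a wide range of inputs, which was the testing concern raised in the question.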