Pattern matching style in C++?


I love Haskell style pattern matching.
I have my C++ code as follows:
ObjectPtr ptr;
if(ptr.isType<Foo>()) { // isType returns a bool
    Ptr<Foo> p = ptr.convertAs<Foo>(); // convertAs returns a Ptr<Foo>
    ......
}
if(ptr.isType<Bar>()) {
    Ptr<Bar> p = ptr.convertAs<Bar>();
    ......
}
Now, are there any macros I can define to simplify this? I have been pondering this for a while, but can't simplify it further.
Thanks!

dynamic_cast would appear to do what you want:
struct A {
    virtual ~A() {}
};
struct B : A { ... };
struct C : A { ... };
A * a = new C;
if ( C * c = dynamic_cast<C*>( a ) ) {
    c->someCfunc();
}
else if ( B * b = dynamic_cast<B*>( a ) ) {
    b->someBfunc();
}
else {
    throw "Don't know that type";
}

I love Haskell style pattern matching.
Then write your program in Haskell.
What you're trying to do is a switch over a type. That's a common thing people do if they want to avoid virtual functions. Now, the latter are a cornerstone of what OO in C++ is all about. If you want to avoid them, why do you program in C++?
As for why this is frowned upon: Imagine you have a lot of code like this
if(ptr.isType<Foo>()) ...
if(ptr.isType<Bar>()) ...
smeared all over your code and then someone comes and adds Baz to the possible types that ptr might represent. Now you're hunting through a big code base, trying to find all those places where you switched over a type, and trying to find out which ones you need to add Baz to.
And, as Murphy has it, just when you're done, along comes Foz to be added as a type, too. (Or, thinking again, if Murphy has his way it creeps in before you had a chance to finish adding Baz.)
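To make the maintenance argument concrete, here is a minimal sketch of the virtual-function alternative. The base class Object and the describe() method are assumptions made up for illustration; only the type names Foo, Bar and Baz come from the discussion above.

```cpp
#include <memory>
#include <string>
#include <vector>

// Hypothetical base class: every type that ptr may represent derives from it.
struct Object {
    virtual ~Object() = default;
    // The per-type behaviour lives here instead of in scattered if-chains.
    virtual std::string describe() const = 0;
};

struct Foo : Object {
    std::string describe() const override { return "Foo"; }
};

struct Bar : Object {
    std::string describe() const override { return "Bar"; }
};

// Adding Baz later only requires this new class. A Baz that forgot to
// implement describe() would be abstract and could not be instantiated.
struct Baz : Object {
    std::string describe() const override { return "Baz"; }
};

// A call site: no type switch, so nothing to hunt down when a type is added.
std::string describe_all(const std::vector<std::unique_ptr<Object>>& objects) {
    std::string out;
    for (const auto& o : objects) {
        out += o->describe();
        out += ' ';
    }
    return out;
}
```

The trade-off is that every new operation now needs a new virtual function in every class, which is exactly the direction in which pattern matching is more convenient.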

Attempting to simulate a pattern matching style in C++ using RTTI is a neat idea, but it's bound to have shortcomings, because there are some significant differences between Haskell and Standard ML style type constructors and C++ subclasses. (Note: below, I use Standard ML syntax because I'm more comfortable with it.)
In Haskell and Standard ML, pattern matching can bind nested values to pattern variables for you (e.g. the pattern a::b::c::ds binds the first three elements of the list to a, b, and c, and the rest of the list to ds). In C++, you'll still have to dig around in the actual nested structures, unless you or someone else comes up with far more complicated macros than have been proposed here.
In Haskell and Standard ML, a type constructor datatype declaration like datatype 'a option = NONE | SOME of 'a defines one new type: 'a option. Constructors NONE and SOME are not types, they are values with types 'a option and 'a -> 'a option, respectively. In C++, when you define subclasses like Foo and Bar to simulate type constructors, you get new types.
In Haskell and Standard ML, constructors like SOME are first-class functions that construct values of the datatype to which they belong. For example, map SOME has the type 'a list -> 'a option list. In C++, using subclasses to simulate type constructors, you don't get this ability.
In Haskell and Standard ML, datatypes are closed, so no one can add more type constructors without changing the original declaration, and the compiler can verify at compile time that the pattern match handles all cases. In C++, you have to go well out of your way to restrict who can subclass your base class.
In the end, are you getting enough benefit from simulated pattern matching compared to using C++ polymorphism in a more typical way? Is using macros to make simulated pattern matching slightly more concise (while obfuscating it for everyone else who reads your code) worthwhile?
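As a side note on the "closed datatype" point: since C++17 the standard library offers std::variant, which gets closer to an ML-style datatype than subclassing does. The sketch below is only an illustration under assumptions: the members of Foo and Bar and the describe function are invented here, and it requires a C++17 compiler. It still does not bind nested values the way real patterns do.

```cpp
#include <string>
#include <variant>

// Hypothetical alternatives, standing in for Foo and Bar from the question.
struct Foo { int value; };
struct Bar { std::string name; };

// A closed sum type: an Object holds exactly one of the listed alternatives.
using Object = std::variant<Foo, Bar>;

// Helper that merges several lambdas into one visitor (a common C++17 idiom).
template <class... Fs> struct overloaded : Fs... { using Fs::operator()...; };
template <class... Fs> overloaded(Fs...) -> overloaded<Fs...>;

std::string describe(const Object& obj) {
    // std::visit calls the lambda that matches the active alternative.
    // If an alternative is added to Object and no lambda accepts it,
    // this call stops compiling, which gives an exhaustiveness check.
    return std::visit(overloaded{
        [](const Foo& f) { return "Foo " + std::to_string(f.value); },
        [](const Bar& b) { return "Bar " + b.name; }
    }, obj);
}
```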

We co-authored a pattern matching library for C++ that allows you to do pattern matching and type analysis very efficiently. The library, called Mach7, has been released under the BSD license and is available on GitHub: https://github.com/solodon4/Mach7. You can find videos, posters, slides, papers as well as the source code there. It currently supports GCC 4.4+, Clang 3.4+ and Visual C++ 2010+. Feel free to ask questions about the library by submitting a GitHub issue against its repository.

I'm assuming that your Ptr template has the concept of a NULL pointer.
ObjectPtr ptr;
if(Ptr<Foo> p = ptr.convertAs<Foo>()) { // convertAs returns a NULL pointer if the conversion can't be done.
    ......
}
if(Ptr<Bar> p = ptr.convertAs<Bar>()) {
    ......
}
Though, as others have noted, switching on type is usually a sign you're doing something wrong in C++. You ought to consider using virtual functions instead.

I think this macro does precisely what you want:
#define DYN_IF(dest_type, dest_ptr, src_ptr) \
    if((src_ptr).isType<dest_type>()) \
        if(int dest_type##dest_ptr = 1) \
            for(Ptr<dest_type> dest_ptr = (src_ptr).convertAs<dest_type>(); \
                dest_type##dest_ptr; \
                dest_type##dest_ptr=0)
Usage:
ObjectPtr ptr;
DYN_IF(Foo, foo_ptr, ptr) {
    // foo_ptr is Ptr<Foo>
}
DYN_IF(Bar, bar_ptr, ptr) // Works without braces too for single statement
    // bar_ptr is Ptr<Bar>
I wouldn't recommend this sort of stuff in code that is meant to be read by somebody else, but since you mentioned the word "macro"...
Also, I wouldn't pretend this has anything to do with pattern matching in the Haskell/OCaml style. Check Scala if you want a language that has semantics similar to C++ (well, sort of) and true pattern matching.

Related

Is there any programming language where you can forbid class casting of return types?

Background
Take this simple Java class as an example:
class SomeWrapper {
    List<SomeType> internalDataStructure = new ArrayList<>();
    boolean hasSomeSpecificSuperImportantElement = false;
    // additional fields that do useful things
    public void add(SomeType element) {
        internalDataStructure.add(element);
        if (SUPER_IMPORTANT.equals(element)) {
            hasSomeSpecificSuperImportantElement = true;
        }
    }
    public Iterable<SomeType> getContents() {
        return internalDataStructure;
    }
    // additional methods that are definitely useful
}
Although it looks somewhat okay it has a big flaw. Because internalDataStructure is returned as is, callers of getContents() could class cast it to List and modify the internal data structure in ways they are not supposed to.
class StupidUserCode {
    void doSomethingStupid(SomeWrapper arg) {
        Iterable<SomeType> contents = arg.getContents();
        // this is perfectly fine
        List<SomeType> asList = (List<SomeType>) contents;
        asList.add(SUPER_IMPORTANT);
    }
}
In Java and in other, similar programming languages, this problem can be solved by wrapping the internal data structure in some immutable wrapper. For example:
public Iterable<SomeType> getContents() {
    return Collections.unmodifiableList(internalDataStructure);
}
This works now, but has, in my humble opinion, a number of drawbacks:
tiny performance drawback, not a big deal in all but the most extreme circumstances
developers who are new to the language need to learn this immutable API
developers of the standard library need to keep adding immutable support for all kinds of data structures
The code becomes more verbose
The immutable wrapper has a number of public methods that all throw exceptions. This is a rather dirty solution in my opinion because it requires additional documentation for users of the API.
Question
Are there any programming languages where you can specify a return type of a method to be impossible to be class cast?
I was thinking of something like synthetic types where appending an exclamation mark ! to the end of a typename makes it un-class-cast-able. Like this:
public Iterable!<SomeType> getContents() {
    return internalDataStructure;
}
void doSomethingStupid(SomeWrapper arg) {
    // The ! here is necessary because Iterable is a different type than Iterable!
    Iterable!<SomeType> contents = arg.getContents();
    // this now becomes a compile-time error because Iterable! can not be cast to anything
    List<SomeType> asList = (List<SomeType>) contents;
    asList.add(SUPER_IMPORTANT);
}

Since the point here is about mutability, the Java example of List vs. Iterable isn't a great example of mutable vs. immutable data types (the Iterator.remove method mutates the underlying collection, so the List could be corrupted by the external caller even without casting).
Let's instead imagine two types, MutableList and ReadonlyList, where MutableList is the subtype, and a ReadonlyList only prevents the user from mutating it; the list itself is not guaranteed to avoid mutation. (We cannot sensibly name the supertype ImmutableList because no value is both a mutable and an immutable list.)
Casting from the supertype to the subtype, e.g. from ReadonlyList to MutableList, is called downcasting. Downcasting is unsafe, because not every value of the supertype is a value of the subtype; so either a check needs to be performed at runtime (as Java does, throwing ClassCastException if the instance being cast does not have the right type), or the program will do something memory-unsafe (as C might do).
In theory, a language might forbid downcasting on the grounds that it is unsafe; popular programming languages don't, because it's convenient to be able to write code which you know is type-safe, but the language's type system is not powerful enough for you to write suitable type annotations which allow the type-checker to prove that the code is type-safe. And no type-checker can reasonably be expected to prove every provable property of your code. Still, in theory there is nothing stopping a language from forbidding downcasts; I just don't think many people will choose to use such a language for developing large software.
That said, I think the solution to the problem you describe would be simply not to make MutableList a subtype of ReadonlyList. The MutableList class can still have a method to get a read-only view, but since there would be no subtype/supertype relation, that view would not be a value of type MutableList so it could not be cast to the mutable type, even if you upcast to a common supertype first.
To avoid the performance cost at runtime, it could be possible for a language to have specific support for such wrappers to allow the wrapper to delegate its methods to the original list at compile-time instead of at runtime.
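The "view that is not a subtype" idea can be sketched in C++ (used here instead of Java only because templates and inlining make the compile-time delegation visible). All names below are hypothetical. One caveat: C++ still has reinterpret_cast, so the guarantee only covers the checked casts (static_cast and dynamic_cast).

```cpp
#include <cstddef>
#include <vector>

// Hypothetical read-only view: it exposes only non-mutating operations and is
// unrelated to MutableList by inheritance, so no checked cast can turn it
// back into the mutable type.
template <class T>
class ReadonlyView {
public:
    explicit ReadonlyView(const std::vector<T>& items) : items_(&items) {}
    std::size_t size() const { return items_->size(); }
    const T& at(std::size_t i) const { return items_->at(i); }
private:
    const std::vector<T>* items_;  // non-owning; later changes to the list stay visible
};

template <class T>
class MutableList {
public:
    void add(const T& value) { items_.push_back(value); }
    std::size_t size() const { return items_.size(); }
    // The view forwards to the same storage. Its member functions are small
    // enough that a compiler can typically inline the delegation.
    ReadonlyView<T> view() const { return ReadonlyView<T>(items_); }
private:
    std::vector<T> items_;
};
```

Note that the view is read-only, not immutable: it must not outlive the list, and elements added to the list afterwards show up in the view.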

Is it possible to work with AST inside D code?

I'm not talking about dynamic programming. I want to work at compile time with constructs like:
obj.where(x => x.some_val >= 14); // <-- LINQ-style :D
I would like to work directly with the AST of the single-expression function argument:
(>=)
|--(14)
+--(.)
|--(x)
+--(some_val)
So far my only idea is to use a special class for the x objects, with all operators (+, -, *, ., ...) overloaded in some crazy, dirty way to record the AST structure of the anonymous function (which works if and only if this class is the only class used in the single-expression function).
Like a tiny AST for a single r-value.
Is it technically possible somehow?

If you want to generate code at compile time, then you can use strings with string mixins. e.g.
import std.format : format; // format comes from Phobos

string foo(string name, int value)
{
    return format("auto %s = %s;", name, value);
}
void bar()
{
    mixin(foo("i", 42));
    assert(i == 42);
}
That's not a particularly interesting example, but as long as you can manipulate strings into the code you want, then you can mix them in, which allows for all kinds of code generation possibilities (both useful and abusive).
However, there is no way to actually manipulate the AST in D. As mentioned in Richard's answer as well as the comments, Walter is strongly against adding such capabilities to the language. So, it's highly unlikely that D will ever have them. But given how much you can do with string mixins, a lot of what someone might want to do with AST macros can be done with string mixins. They allow you to generate pretty much any code that you might want to. They just don't allow you to manipulate existing code.

Nope, and Walter has been fairly against it in the past (e.g. AST macros).

Using dynamic typing in D, a statically typed language

I was implementing a dynamic typing library for D when I ran across an interesting problem.
Right now, I've succeeded in making a function called dynamic() which returns a dynamic version of an object.
For example:
import std.stdio, std.dynamic.core;
class Foo
{
    string bar(string a) { return a ~ "OMG"; }
    int opUnary(string s)() if (s == "-") { return 0; }
}
void main(string[] argv)
{
    Dynamic d = dynamic(new Foo());
    Dynamic result = d.bar("hi");
    writeln(result); // Uh-oh
}
The problem I've run across is the fact that writeln tries to use compile-time reflection to figure out how to treat result.
What's the first thing it tries? isInputRange!(typeof(result))
The trouble is, it returns true! Why? Because I have to assume that all members which it needs exist, unless I can prove otherwise at run time -- which is too late. So the program tries to call front, popFront, and empty on result, crashing my program.
I can't think of a way to fix this. Does anyone have an idea?

You are trying to make two fundamentally different concepts work together, namely templates and dynamic typing. Templates rely very much on static typing; isInputRange works by checking which attributes or methods a type has. Your dynamic type is treated as having every attribute or method at compile time, ergo it is treated as fulfilling every static duck-typing interface.
Therefore, to make Dynamic work in a statically typed environment, you have to provide more static information at some places.
Some solutions I can see:
1. Provide your own dynamically typed implementations for heavily used functions. The whole problem you are having is caused by the fact that you are trying to use generic functions that assume static typing with dynamic types.
2. Explicitly make dynamic a range of char, and take care of converting the underlying data to a string yourself. (You would need a custom toString method anyway even if the isInputRange issue did not exist, because otherwise its result would again be of Dynamic type.) This would probably make writeln(d); work.
3. Provide wrappers for dynamic that allow you to pass dynamic types into various templated functions. (Those would just exhibit a static interface and forward all calls to Dynamic.)
E.g.:
Dynamic d;
// wrap d to turn it into a compile-time input range (but NOT eg a forward range)
Dynamic d2=dynamic(map!q{a*2}(dynInputRange(d)));
// profit
4. Add a member template to Dynamic that allows you to statically disable some member function names.
Eg:
static assert(!isForwardRange!(typeof(d.without!"save")));

What is wrong with using std.variant, which implements all you need for dynamic typing (along with quite a bit of syntactic sugar)?

Could you provide an overload for isInputRange? Something like this (note that I haven't looked at the implementation of isInputRange):
template isInputRange(T : Dynamic) {
    enum isInputRange = false;
}
If this is provided by your dynamic.core, I think this overload should be chosen before the std lib one.

For the general case Dynamic has to accept any method lookup at compile time, as you said. Suppose for a moment that you could prevent the isInputRange predicate from evaluating to true; then the wrong code would be generated when you try to create a Dynamic from an input range.
I don't think this is fixable, at least not in a general way. In this particular case the best solution I can think of is that Dynamic provides its own version of toString, and writeln would prefer that over the inputRange specialization. I believe writeln doesn't do this at the moment, at least not for structs, but it probably should.
Another compromise would be to disallow a few methods such as popFront in the opDispatch constraint, instead Dynamic would provide opIndex or a member object to access these special cases. This might not be as bad as it sounds, because the special cases are rare and using them would result in an obvious compiler error.
I think that the best way to salvage this kind of method resolution for Dynamic is to fix writeln and accept that Dynamic will not work with all templated code.

Have you looked into std.variant?
import std.stdio, std.variant;
class Foo {
    string Bar(string a) {
        return a ~ " are Cool!";
    }
}
void main() {
    Variant foo = new Foo();
    Variant result = foo.peek!Foo.Bar("Variants");
    writeln(result); // Variants are Cool!
}
http://www.d-programming-language.org/phobos/std_variant.html

C++ design pattern to get rid of if-then-else

I have the following piece of code:
if (book.type == A) do_something();
else if (book.type == B) do_something_else();
....
else do_some_default_thing();
This code will need to be modified whenever there is a new book type
or when a book type is removed. I know that I can use enums and use a switch
statement. Is there a design pattern that removes this if-then-else?
What are the advantages of such a pattern over using a switch statement?

You could make a different class for each type of book. Each class could implement the same interface, and override a method to perform the necessary class-specific logic.
I'm not saying that's necessarily better, but it is an option.

As others have pointed out, a virtual function should probably be your first choice.
If, for some reason, that doesn't make sense/work well for your design, another possibility would be to use an std::map using book.type as a key and a pointer to function (or functor, etc.) as the associated value, so you just lookup the action to take for a particular type (which is pretty much how many OO languages implement their equivalent of virtual functions, under the hood).
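A minimal sketch of the std::map approach, under some assumptions: the BookType enum, the Book struct and the handle function are invented for illustration, and the actions return strings only so that the result is easy to observe.

```cpp
#include <map>
#include <string>

// Hypothetical book model; the question only shows that a book has a type.
enum class BookType { A, B, C };
struct Book { BookType type; };

using Action = std::string (*)(const Book&);

std::string do_something(const Book&) { return "something"; }
std::string do_something_else(const Book&) { return "something else"; }
std::string do_some_default_thing(const Book&) { return "default"; }

// Supporting a new book type means adding one entry to this table.
const std::map<BookType, Action>& actions() {
    static const std::map<BookType, Action> table = {
        {BookType::A, do_something},
        {BookType::B, do_something_else},
    };
    return table;
}

std::string handle(const Book& book) {
    auto it = actions().find(book.type);
    // Fall back to the default action when the type has no entry.
    return it != actions().end() ? it->second(book) : do_some_default_thing(book);
}
```

Compared with a switch, the table can also be filled or changed at run time, for example from a plugin or a configuration step.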

Each different type of book is a different sub-class of the parent class, and each class implements a method do_some_action() with the same interface. You invoke the method when you want the action to take place.

Yes, it's called looping:
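struct BookType {
    char type;
    void (*action)(Book&); // "do" is a keyword in C++, so the member needs another name
};
BookType types[] = {{A, do_something}, {B, do_something_else}, ...};
const int types_length = sizeof(types) / sizeof(types[0]);
for (int i = 0; i < types_length; i++) {
    if (book.type == types[i].type) types[i].action(book);
}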
A better approach, though, is to make do_something, do_something_else, etc. methods of Book, so:
struct Book {
    virtual ~Book() {}
    virtual void perform() = 0; // "do" is a keyword in C++, so the method needs another name
};
struct A : Book {
    void perform() {
        // ... do_something
    }
};
struct B : Book {
    void perform() {
        // ... do_something_else
    }
};
so you only need to do:
book.perform();

Those if-then-else-if constructs are one of my most acute pet peeves. I find it difficult to conjure up a less imaginative design choice. But enough of that. On to what can be done about it.
I've used several design approaches depending on the exact nature of the action to be taken.
If the number of possibilities is small and future expansion is unlikely I may just use a switch statement. But I'm sure you didn't come all the way to SOF to hear something that boring.
If the action is the assignment of a value then a table-driven approach allows future growth without actually making code changes. Simply add and remove table entries.
If the action involves complex method invocations then I tend to use the Chain of Responsibility design pattern. I'll build a list of objects that each knows how to handle the actions for a particular case.
You hand the item to be processed to the first handler object. If it knows what to do with the item it performs the action. If it doesn't, it passes the item off to the next handler in the list. This continues until the item is processed or it falls into the default handler that cleans up or prints an error or whatever. Maintenance is simple -- you add or remove handler objects from the list.
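Roughly, the handler list described above could look like the sketch below. The Handler interface, the concrete handler names and the string results are all assumptions made for illustration, and the handlers are kept in a vector rather than linked to each other.

```cpp
#include <memory>
#include <string>
#include <vector>

struct Book { char type; };  // hypothetical item to process

// Each handler either deals with the item or declines it.
struct Handler {
    virtual ~Handler() = default;
    // Returns true and fills `result` when this handler processed the book.
    virtual bool handle(const Book& book, std::string& result) const = 0;
};

struct TypeAHandler : Handler {
    bool handle(const Book& book, std::string& result) const override {
        if (book.type != 'A') return false;
        result = "handled A";
        return true;
    }
};

struct TypeBHandler : Handler {
    bool handle(const Book& book, std::string& result) const override {
        if (book.type != 'B') return false;
        result = "handled B";
        return true;
    }
};

// Walks the chain; the first handler that accepts the book wins.
std::string process(const std::vector<std::unique_ptr<Handler>>& chain,
                    const Book& book) {
    std::string result;
    for (const auto& h : chain) {
        if (h->handle(book, result)) return result;
    }
    return "default";  // nobody accepted: this plays the role of the default handler
}
```

Adding or removing a book type then means adding or removing one handler object from the vector; process itself does not change.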

You could define a subclass for each book type, and define a virtual function do_something. Each subclass A, B, etc would have its own version of do_something that it calls into, and do_some_default_thing then just becomes the do_something method in the base class.
Anyway, just one possible approach. You would have to evaluate whether it really makes things easier for you...

Strategy Design Pattern is what I think you need.

As an alternative to having a different class for each book, consider having a map from book types to function pointers. Then your code would look like this (sorry for pseudocode, C++ isn't at the tip of my fingers these days):
if book.type in booktypemap:
booktypemap[book.type]();
else
defaultfunc();

In C or C++, is there a way to extend a class without inheritance?

Is there a way to implement functionality like Class Categories (of Objective-C) or Extension Methods (of C# 3.0) in C and/or C++?

C++ has free functions, but sometimes extension methods work better when you nest many functions together. Take a look at this C# code:
var r = numbers.Where(x => x > 2).Select(x => x * x);
If we write this in C++ using free functions, it would look like this:
auto r = select(where(numbers, [](int x) { return x > 2; }), [](int x) { return x * x; });
Not only is this difficult to read, it is also difficult to write. The common way to solve this is to create what is called a pipable function. These functions are created by overloading the | pipe operator (which is really just the bitwise OR operator). So the code above could be written like this:
auto r = numbers | where([](int x) { return x > 2; }) | select([](int x) { return x * x; });
This is much easier to read and write. Many libraries use pipable functions for ranges, but the idea could be extended to other classes as well. Boost uses it in their range library, pstade oven uses it, and this C++ linq library uses it as well.
If you would like to write your own pipable function, boost explains how to do that here. Other libraries, however, provide function adaptors to make it easier. Pstade egg has a pipable adaptor, and linq provides the range_extension adaptor to create a pipable function for ranges at least.
Using linq, you first just create your function as a function object like this:
struct contains_t
{
    template<class Range, class T>
    bool operator()(Range && r, T && x) const
    { return (r | linq::find(x)) != boost::end(r); }
};
Then you initialize the function using static initialization like this:
range_extension<contains_t> contains = {};
Then you can use your pipable function like this:
if (numbers | contains(5)) printf("We have a 5");
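For readers who want to see the mechanism without any library, here is a hand-rolled sketch of pipable functions. It is not how Boost, pstade or linq implement them: the namespace sketch, Pipable and make_pipable are invented names, the functions only work on std::vector<int>, and they build a new vector eagerly instead of a lazy range. It needs C++14 for return type deduction.

```cpp
#include <utility>
#include <vector>

namespace sketch {

// Hypothetical wrapper that marks a callable as usable on the right of '|'.
template <class F>
struct Pipable {
    F f;
};

template <class F>
Pipable<F> make_pipable(F f) { return Pipable<F>{std::move(f)}; }

// range | pipable  is defined as  pipable.f(range)
template <class Range, class F>
auto operator|(const Range& range, const Pipable<F>& p) -> decltype(p.f(range)) {
    return p.f(range);
}

// where(pred) keeps the elements for which pred returns true.
template <class Pred>
auto where(Pred pred) {
    return make_pipable([pred](const std::vector<int>& in) {
        std::vector<int> out;
        for (int x : in) if (pred(x)) out.push_back(x);
        return out;
    });
}

// select(fn) applies fn to every element.
template <class Fn>
auto select(Fn fn) {
    return make_pipable([fn](const std::vector<int>& in) {
        std::vector<int> out;
        for (int x : in) out.push_back(fn(x));
        return out;
    });
}

}  // namespace sketch
```

With these definitions, numbers | sketch::where(...) | sketch::select(...) reads left to right like the C# version. The operator is found through argument-dependent lookup because Pipable lives in the same namespace.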

Not really. It's not the C++ way to treat classes like this.
Amongst others, Meyers argues that it's best to have a small class with the minimal set of operations that make it fully useful. If you want to expand the feature set, you may add a utility namespace (e.g. namespace ClassUtil) that contains non-member utility functions that operate on that minimal class. It's easy to add functions to a namespace from anywhere.
You can check a discussion on the subject here.
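A small sketch of that layout follows. The Inventory class and the InventoryUtil namespace are hypothetical names chosen for illustration, in the spirit of the ClassUtil example above.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// A deliberately minimal class: only operations that need access to the
// internals are members.
class Inventory {
public:
    void add(const std::string& item) { items_.push_back(item); }
    std::size_t size() const { return items_.size(); }
    const std::string& at(std::size_t i) const { return items_.at(i); }
private:
    std::vector<std::string> items_;
};

// Extensions live in a namespace and use only the public interface.
// Any other header can reopen the namespace and add more functions.
namespace InventoryUtil {
    inline bool contains(const Inventory& inv, const std::string& item) {
        for (std::size_t i = 0; i < inv.size(); ++i)
            if (inv.at(i) == item) return true;
        return false;
    }
}
```

Callers write InventoryUtil::contains(inv, "pen") rather than inv.contains("pen"), which is the syntactic price compared with C# extension methods.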

C++ doesn't have sealed classes or single class inheritance, so in most cases you can subclass the base class. There are creative ways to make a class non-inheritable, but they are few and far between. In general, C++ doesn't have the problems C# does that gave birth to extension methods.
C is not object-oriented, so the question doesn't really apply.

With regard to C#'s extension methods: Not directly. C++ has less need for these things because C++ supports free functions. I've never used Objective-C so I can't comment there.

Can you use an interface? Extension methods are an easy way to avoid subclassing, but they are rendered semi-useless when proper OO techniques are used. The reason that they are used with Linq so much is so that the VS team did not have to go and update code that would most likely break a lot of legacy applications.
Per MSDN:
"In general, we recommend that you implement extension methods sparingly and only when you have to. Whenever possible, client code that must extend an existing type should do so by creating a new type derived from the existing type."
http://msdn.microsoft.com/en-us/library/bb383977.aspx
