Lambda <-> closure equivalence in C++ - c++

I was watching a video called Lambda? You Keep Using that Letter by Kevlin Henney, where he states that closures and objects are fundamentally equivalent:
He then proves his point with this JavaScript code, which implements a stack as a closure:
const newStack = () => {
  const items = []
  return {
    depth: () => items.length,
    top: () => items[0],
    push: newTop => { items.unshift(newTop) },
    pop: () => { items.shift() },
  }
}
The advantage of a closure versus a class is that its state is really hidden, whereas a private member is more "inaccessible" than "hidden".
I tried to do something equivalent in C++. However, it seems that it is difficult to express this in C++.
My current version is below, and it has two major drawbacks:
it does compile, but it will not work (the inner shared_ptr is released immediately after the closure is created)
it is a bit verbose: depth, top, push and pop are each repeated three times.
auto newStack = []() {
    auto items = std::make_shared<std::stack<int>>();
    auto depth = [&items]() { return items->size(); };
    auto top = [&items]() { return items->top(); };
    auto push = [&items](int newTop) { items->push(newTop); };
    auto pop = [&items]() { items->pop(); };
    struct R {
        decltype(depth) depth;
        decltype(top) top;
        decltype(push) push;
        decltype(pop) pop;
    };
    return R{ depth, top, push, pop};
};
godbolt version here
Is there a working way to do it in C++?

Yes, of course there's a better way to do it in C++: don't use a lambda.
A lambda expression defines a class. A closure is an instance of that class--an object. We don't need comparisons to other languages to tell us that--it's exactly how lambdas and closures are defined in C++. §[expr.prim.lambda.closure]:
The type of a lambda-expression (which is also the type of the closure object) is a unique, unnamed non-union class type, called the closure type, whose properties are described below.
But (and this is an important point) at least in C++, a lambda expression defines a class with a very limited public interface. Specifically, it provides an overload of operator(), and if it doesn't capture anything, a conversion to pointer to function. If it does capture something, it also defines a constructor to do the capturing. And of course if it captures things, it defines member variables to hold whatever it captures.
But that's all it really defines. It's not that it's doing a better job of hiding whatever else it may contain. It's that it really doesn't contain anything else.
In your case, you're trying to define a type that has four separate member functions that all operate on some state they share. As you've shown, it's sort of possible to do that by externalizing the state, so you have something nearly equivalent to some C code (or something on that order) that simply has data and some functions that operate on that data. And yes, you can push them together into a structure to do at least some imitation of a class with member functions.
But you're pretty much fighting against the system (so to speak) to do that in C++. Lambdas/closures (as they're defined in C++) are not intended to let you define things that have multiple separate entry points, each carrying out separate actions on shared data. As Samuel Johnson's old line says, "[It] is like a dog walking on its hind legs. It is not done well; but you are surprised to find it done at all."
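For completeness, here is a minimal sketch of what it takes to make the closure approach from the question at least work. It assumes the fix is to capture the shared_ptr by value instead of by reference, so every closure keeps the stack alive. The StackOps struct and its std::function members are illustrative choices, not something from the question, and std::function adds some call overhead.

```cpp
#include <cstddef>
#include <functional>
#include <memory>
#include <stack>

// Hypothetical bundle of four closures that share one hidden stack.
struct StackOps {
    std::function<std::size_t()> depth;
    std::function<int()> top;
    std::function<void(int)> push;
    std::function<void()> pop;
};

StackOps newStack() {
    auto items = std::make_shared<std::stack<int>>();
    // Capturing `items` by value copies the shared_ptr into each closure,
    // so the stack stays alive after this function returns.
    return StackOps{
        [items] { return items->size(); },
        [items] { return items->top(); },
        [items](int newTop) { items->push(newTop); },
        [items] { items->pop(); },
    };
}
```

The names depth, top, push and pop are still repeated, and the result is essentially a hand-built class, which is the point made above.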

Related

Is it possible to parameterize a generic with a type that is itself generic? [duplicate]

Is it possible to do something like this in Rust?
trait Foo<T> {}
struct A;
struct B;
struct Bar<T: Foo> {
    a: T<A>,
    b: T<B>
}
I know I could just use two parameters for Bar, but I think there has to be a better way to do this.
I want to implement a Graph structure. As I can't just bind the nodes and edges to their parent's lifetime, I want to have something like Rc. However, sometimes one may need a Graph with access from multiple threads. So I'd have to have both an implementation with Rc and one with Arc.
That's what Foo is good for: I implement Foo for both Rc and Arc (Foo would require Deref) and I use a parameter T bound to Foo. That's how I wanted to have one struct for single thread and multi thread usage.
⇒ This is currently impossible to express in Rust's type system ☹
Fortunately, it will be possible in the future thanks to "Generic Associated Types" as proposed in this RFC. You can track the status of implementation and stabilization in the corresponding tracking issue.
The important term here is "HKT" (higher kinded types). It's a feature of a type system which is not yet implemented in Rust. Haskell offers HKTs. In the C++ world the closest equivalent is "template template parameters". The generic associated types mentioned above are also a form of HKTs.
But what are HKTs, really?
Let's start slowly: what is a simple type as we know it? Let's list some types: i32, bool, String. These are all types... you can have a value (variable) of these types. What about Vec<i32>? It's also a simple type! You can have a variable of type Vec<i32>, no problem!
We want to group these types together; we call this categorisation a "kind of a type". If we want to talk in a very abstract way (about types of types) we choose other words, kind in this case. There is even a notation for kinds of types. For our simple types from above, we say: the kind of those types is
*
Yes, just a star, very easy. The notation makes more sense later!
Let's search for types that are of a different kind than our simple types. Mutex<HashMap<Vec<i32>, String>>? Nope, it's fairly complex maybe, but it's still of kind * and we still can have a variable of that type.
What about Vec? Yes, we omitted the angle-brackets. Yes, this is indeed another kind of type! Can we have a variable of type Vec? No! A vector of what?!
This kind is denoted as:
* -> *
This just says: give me a normal type (*) and I will return a normal type! Give a normal type i32 to this thing (Vec) and it will return a normal type Vec<i32>! It's also called a type constructor, because it is used to construct types. We can even go further:
* -> * -> *
This is a bit strange, because it has to do with currying and reads odd for a non-Haskell programmer. But it means: give me two types and I will return a type. Let's think about an example... Result! The Result type constructor will return a concrete type Result<A, B> after you provided two concrete types A and B.
The term higher kinded types just refers to all kinds of types which are not *, i.e. the type constructors.
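For comparison, here is a rough C++ sketch of a parameter of kind * -> * using a template template parameter. It mirrors the Bar from the question; LocalPtr is a made-up single-owner wrapper added only to show a second type constructor with the same shape, and the member defaults in A and B are there just so there is something to inspect.

```cpp
#include <memory>
#include <string>

struct A { int id = 1; };
struct B { std::string name = "b"; };

// `Ptr` is a template template parameter: it accepts a template such as
// std::shared_ptr itself (kind * -> *), not a concrete type like
// std::shared_ptr<A> (kind *).
template <template <class> class Ptr>
struct Bar {
    Ptr<A> a;
    Ptr<B> b;
};

// Hypothetical second type constructor with the same one-parameter shape.
template <class T>
struct LocalPtr {
    std::unique_ptr<T> p;
    T& operator*() const { return *p; }
};
```

With these definitions, Bar<std::shared_ptr> and Bar<LocalPtr> are two different concrete types built from the same generic struct.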
In your example
When you write struct Bar<T: Foo> you want T to be of the kind * -> *, meaning: you can give one type to T and receive a simple type. But as I said, this is not yet expressible in Rust. To use a similar syntax, one might imagine that this could work in the future:
// This does NOT WORK!
struct Bar<for<U> T> where T<U>: Foo {
    a: T<A>,
    b: T<B>,
}
The for<> syntax is borrowed from "higher-ranked trait bounds" (HRTB), which can be used today for abstracting over lifetimes (most commonly used with closures).
Links
In case you want to read more about this topic, here are some links:
Niko Matsakis' great series of blog posts discussing one possible solution (associated type constructors) to the HKT problem
The RFC proposing generic associated types (just a less scary name for "associated type constructors")
HRTB explanation
Bonus: the solution to your problem in case associated type constructors will be implemented (I think, as there is no way to test)!
We have to take a detour in our implementation since the RFC wouldn't allow passing Rc as a type parameter directly. It doesn't introduce HKTs directly, so to speak. But as Niko argues in his blog post, we can have the same flexibility and power as HKTs with associated type constructors by using so-called "family traits".
/// This trait will be implemented for marker types, which serve as
/// kind of a proxy to get the real type.
trait RefCountedFamily {
    /// An associated type constructor. `Ptr` is a type constructor, because
    /// it is generic over another type (kind * -> *).
    type Ptr<T>;
}
struct RcFamily;
impl RefCountedFamily for RcFamily {
    /// In this implementation we say that the type constructor to construct
    /// the pointer type is `Rc`.
    type Ptr<T> = Rc<T>;
}
struct ArcFamily;
impl RefCountedFamily for ArcFamily {
    type Ptr<T> = Arc<T>;
}
struct Graph<P: RefCountedFamily> {
    // Here we use the type constructor to build our types
    nodes: P::Ptr<Node>,
    edges: P::Ptr<Edge>,
}
// Using the type is a bit awkward though:
type MultiThreadedGraph = Graph<ArcFamily>;
For more information, you should really read Niko's blog posts. Difficult topics explained well enough that even I can understand them, more or less!
EDIT: I just noticed that Niko actually used the Arc/Rc example in his blog post! I totally forgot that and thought I had come up with the code above myself... but maybe my subconscious still remembered, as I chose a few names exactly as Niko did. Anyway, here is his (probably way better) take on the issue.
In a way Rust does have what looks a lot like HKT (see Lukas's answer for a good description of what they are), though with some arguably awkward syntax.
First, you need to define the interface for the pointer type you want, which can be done using a generic trait. For example:
trait SharedPointer<T>: Clone {
    fn new(v: T) -> Self;
    // more, eg: fn get(&self) -> &T;
}
Plus a generic trait which defines an associated type which is the type you really want, which must implement your interface:
trait Param<T> {
    type Pointer: SharedPointer<T>;
}
Next, we implement that interface for the types we're interested in:
impl<T> SharedPointer<T> for Rc<T> {
    fn new(v: T) -> Self {
        Rc::new(v)
    }
}
impl<T> SharedPointer<T> for Arc<T> {
    fn new(v: T) -> Self {
        Arc::new(v)
    }
}
And define some dummy types which implement the Param trait above. This is the key part; we can have one type (RcParam) which implements Param<T> for any T, including being able to supply a type, which means we're simulating a higher-kinded type.
struct RcParam;
struct ArcParam;
impl<T> Param<T> for RcParam {
    type Pointer = Rc<T>;
}
impl<T> Param<T> for ArcParam {
    type Pointer = Arc<T>;
}
And finally we can use it:
struct A;
struct B;
struct Foo<P: Param<A> + Param<B>> {
    a: <P as Param<A>>::Pointer,
    b: <P as Param<B>>::Pointer,
}
impl<P: Param<A> + Param<B>> Foo<P> {
    fn new(a: A, b: B) -> Foo<P> {
        Foo {
            a: <P as Param<A>>::Pointer::new(a),
            b: <P as Param<B>>::Pointer::new(b),
        }
    }
}
fn main() {
    // Look ma, we're using a generic smart pointer type!
    let foo = Foo::<RcParam>::new(A, B);
    let afoo = Foo::<ArcParam>::new(A, B);
}
Playground

Proper design for C++ class wrapping multiple possible types

I am trying to implement a C++ class which will wrap a value (among other things). This value may be one of a number of types (string, memory buffer, number, vector).
The easy way to implement this would be to do something like this
class A {
    Type type;
    // Only one of these will be valid data; which one will be indicated by `type` (an enum)
    std::wstring wData{};
    long dwData{};
    MemoryBuffer lpData{};
    std::vector<std::wstring> vData{};
};
This feels inelegant and like it wastes memory.
I also tried implementing this as a union, but it came with significant development overhead (defining custom destructors/move constructors/copy constructors), and even with all of those, there were still some errors I encountered.
I've also considered making A a base class and making a derived class for each possible value it can hold. This also feels like it isn't a great way to solve the problem.
My last approach would be to make each member an std::optional, but this still adds some overhead.
Which approach would be the best? Or is there another design that works better than any of these?
Use std::variant. It is typesafe, tested and exactly the right thing for a finite number of possible types.
It also gets rid of the type enum.
class A {
    std::variant<std::wstring, long, MemoryBuffer, std::vector<std::wstring>> m_data{}; // default-initializes the wstring
public:
    template<class T>
    void set_data(T&& data) {
        m_data = std::forward<T>(data);
    }
    std::size_t get_index() const { // returns the index of the active type
        return m_data.index();
    }
    long& get_ldata() {
        return std::get<long>(m_data); // throws std::bad_variant_access if long is not the active type
    }
    // and the others, or
    template<class T>
    T& get_data() { // by type
        return std::get<T>(m_data);
    }
    template<std::size_t N>
    auto& get_data() { // by index
        return std::get<N>(m_data);
    }
};
// using:
A a;
a.get_index() == 0; // true
a.set_data(42);
a.get_index() == 1; // true
auto l = a.get_data<long>(); // l is of type long, has value 42
a.get_data<long>() = 1;
l = a.get_data<1>();
PS: This example does not even include the coolest (in my opinion) feature of std::variant: std::visit. I am not sure what you want to do with your class, so I cannot create a meaningful example. If you let me know, I will think about it.
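Without knowing the real use case, here is only a generic sketch of what std::visit can look like for this set of types (C++17). MemoryBuffer is a made-up stand-in for the question's type, and describe is a hypothetical helper that just reports which alternative is active.

```cpp
#include <string>
#include <type_traits>
#include <variant>
#include <vector>

// Hypothetical stand-in for the question's MemoryBuffer type.
struct MemoryBuffer { std::vector<unsigned char> bytes; };

using Value = std::variant<std::wstring, long, MemoryBuffer, std::vector<std::wstring>>;

// Returns a short description of whichever alternative is currently active.
std::string describe(const Value& v) {
    return std::visit([](const auto& x) -> std::string {
        using T = std::decay_t<decltype(x)>;
        if constexpr (std::is_same_v<T, std::wstring>)
            return "string of length " + std::to_string(x.size());
        else if constexpr (std::is_same_v<T, long>)
            return "number " + std::to_string(x);
        else if constexpr (std::is_same_v<T, MemoryBuffer>)
            return "buffer of " + std::to_string(x.bytes.size()) + " bytes";
        else
            return "list of " + std::to_string(x.size()) + " strings";
    }, v);
}
```

The visitor is instantiated once per alternative, so the compiler rejects any use of x that is not valid for the branch's type; there is no separate type enum to keep in sync.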
You basically want QVariant without the rest of Qt, then :)?
As others have mentioned, you could use std::variant and put using MyVariant = std::variant<t1, t2, ...> in some common header, and then use it everywhere it's called for. This isn't as inelegant as you may think - the specific types to be passed around are only provided in one place. It is the only way to do it without building a metatype machinery that can encapsulate operations on any type of an object.
That's where boost::type_erasure::any comes in: it does precisely that. It wraps concepts, and thus supports any object that implements these concepts. Which concepts are required is up to you, but in general you'd want to choose enough of them to make the type usable and useful, yet not so many that you exclude some types prematurely. It's probably the way to go. You'd have using MyVariant = any<construct, _a>;, where construct is a contract list (there is an example of one in the documentation) and _a is a type placeholder from boost::type_erasure.
The fundamental difference between std::variant and boost::type_erasure::any is that variant is parametrized on concrete types, whereas any is parametrized on contracts that the types are bound to. Then, any will happily store an arbitrary type that fulfills all of those contracts. The "central location" where you define an alias for the variant type will constantly grow with variant, as you need to encapsulate more types. With any, the central location will be mostly static and would change rarely, since changing the contract requirements is likely to require fixes/adaptations to the carried types as well as to the points of use.

Static initialization of function pointers

I have a class that contains a function pointer. I would like to initialize various instances of the class statically, but I can't figure out the correct syntax for this.
Let's say, this is my class
class fooClass
{
    int theToken;
    string theOutput;
    bool (*theDefault)( void );
};
I now would like to create a static instance of this, like this…
fooClass test
{
    1,
    "Welcome",
    (){ return (theToken & 1 ) ? true : false; }
};
As I said, I can't figure out the proper syntax for the function pointer line. Or is it even possible like this? I'd really like not having to break out every function I create this way into its own function declaration.
What I'm trying to do is, allow each instance to have a unique default function because each instance represents a unique data-driven building block of a bigger system. The code I put in there is just for illustrative purposes. This default function will access certain global variables as well as some of the member variables and if need be I could pass this into the function.
Could someone point me in the right direction how I'd have to write the initialization for it to work under C++14?
If you want to refer to struct members inside the function, you cannot make do with a plain function pointer that takes no arguments, as it doesn't receive the this pointer.
My advice is to at the very least change it to a pointer to a function taking the instance as an argument; then in the initialization you can pass a capture-less lambda (which can be converted to a plain function pointer):
struct fooClass // a struct, so the members are public and aggregate initialization works
{
    int theToken;
    std::string theOutput;
    bool (*theDefault)( fooClass *that );
    // you may provide a helper for ease of use
    bool Default() { return theDefault(this); }
};
fooClass test
{
    1,
    "Welcome",
    [] (fooClass *that){ return (that->theToken & 1 ) ? true : false; }
};
You can also use an std::function<bool(fooClass*)> to allow even functors, lambdas with captures & co. if you are ok with the increased overhead.
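A minimal sketch of that std::function variant, assuming the members are public so that aggregate initialization still works. tokenHasBits is a made-up factory, shown only because its lambda captures a value, which a plain function pointer could not hold.

```cpp
#include <functional>
#include <string>

struct fooClass {
    int theToken;
    std::string theOutput;
    std::function<bool(fooClass*)> theDefault;
    // Helper that passes the instance explicitly, as in the function-pointer version.
    bool Default() { return theDefault(this); }
};

// Hypothetical factory: the returned lambda captures `mask` by value.
std::function<bool(fooClass*)> tokenHasBits(int mask) {
    return [mask](fooClass* that) { return (that->theToken & mask) != 0; };
}
```

Used as fooClass test{1, "Welcome", tokenHasBits(1)};, a copy of test evaluates the callback against its own members, because the instance is passed in as an argument rather than captured.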
You may be tempted to use a plain std::function<bool()> instead, and use a lambda capturing the instance by reference, such as
fooClass test
{
    1,
    "Welcome",
    [&test] (){ return (test.theToken & 1 ) ? true : false; }
};
This does work, but is extremely dangerous if test happens to be copied, as theDefault will still refer to test even in the copy (and even after the original will have been destroyed).
(incidentally, this is how OOP is often done in languages such as Lua, but there (1) objects are not copied and (2) automatic memory management makes sure that closures "keep alive" the objects they capture)

Recursive functions with constant parameters

In C++ is there a preferred way of dealing with a recursive function that reuses an unchanged object every time? For example (pseudo code):
void func(HashMap h, Logger logger, int depth)
{
    // Do stuff then call func recursively...
    func(h, logger, depth+1);
}
So on every call we pass in several objects (hash map and logger) unchanged. Yes, we can pass by reference, but it still seems inefficient and doesn't look very pretty. The only alternatives I can think of are global variables or bundling up the constant parameters in a struct.
What you're describing is a closed lexical environment, aka a "closure". That concept exists "for free" in dynamic languages like Lisp and Scheme, where the "let over lambda" pattern allows you to capture variables in a function and use them even though their enclosing scope is gone or exited.
You can simulate closures in C++ using a struct as the container for the captured variables and apply functions (functors) to the struct. The most common pattern for this is typically the way that operator() is used:
struct my_closure {
    HashMap &hmap_;
    Logger &logger_;
    int depth_;
    my_closure(HashMap &hmap, Logger &logger, int depth) :
        hmap_(hmap), logger_(logger), depth_(depth)
    { }
    void operator()(void)
    { /* do stuff here, including recurse */ }
};
The only thing this will save you is pushing extra stuff on the stack, which doesn't necessarily save you much in terms of cycles (1 struct reference [this pointer for the struct] vs. 2 references + 1 integer.) If your compiler supports tail recursion, this could be a benefit at the expense of readability.
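To make the recursion concrete, here is a small runnable sketch of the same idea. The HashMap alias, the Logger with its log method, the Walker name and the max_depth_ limit are all placeholders invented for the example; the question doesn't say what the real types or the real work look like.

```cpp
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for the question's HashMap and Logger.
using HashMap = std::unordered_map<std::string, int>;
struct Logger {
    std::vector<std::string> lines;
    void log(const std::string& s) { lines.push_back(s); }
};

struct Walker {
    const HashMap& hmap_;
    Logger& logger_;
    int max_depth_;

    // Only `depth` changes between calls; the shared objects are reached
    // through *this instead of being passed again on every call.
    void operator()(int depth = 0) {
        logger_.log("depth " + std::to_string(depth) +
                    ", map size " + std::to_string(hmap_.size()));
        if (depth < max_depth_) (*this)(depth + 1);
    }
};
```

Usage would be something like Walker w{h, logger, 3}; w();, which logs one line for each depth from 0 to 3.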
This is perfectly clean code that expresses exactly what is happening. Of course in C++ you should pass the HashMap and the Logger as const reference.
From a performance point of view, however, it might be better to make these two objects class member variables. But this could make the code harder to understand, so only do it if you really have a performance issue.
By the way, what is the function doing if it doesn't modify the HashMap and returns void?

Design advice: calling a method on the container object from the contained object

I have a simple setup with
class Container {
    Handler h;
};
All the Container objects have a "warning()" method. I would like to also have a way to output warnings from within the Handler object, but send these warnings using the facilities of the containing object.
I do realize that holding a reference to the container in the contained object is odd (normally the contained object should not know anything about its container). Now, in a language with closures I would have done it like so (imaginary syntax):
h.set_warning_handler { | char* message |
    this->warning(message)
}
but I am working in C++ and it's not a place to use Apple dialect things like blocks.
What would be the preferred way to tackle this? Or just set that reference and forget about it?
C++11 has closures:
h.set_warning_handler([&](char const* message) { this->warning(message); });
[&] specifies to capture the context by reference (needed to capture this). (…) declares the argument list, and {…} the lambda’s body.
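A self-contained sketch of how the pieces could fit together, assuming Handler stores the callback in a std::function. The handle and process methods and the warnings vector are made up for illustration; the question doesn't show what Handler or warning() actually do.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

class Handler {
    std::function<void(char const*)> warn_;
public:
    void set_warning_handler(std::function<void(char const*)> f) { warn_ = std::move(f); }
    // Hypothetical work function that reports a problem through the callback.
    void handle(int value) {
        if (value < 0 && warn_) warn_("negative value");
    }
};

class Container {
    Handler h;
public:
    std::vector<std::string> warnings;  // hypothetical sink standing in for real output

    Container() {
        h.set_warning_handler([this](char const* message) { this->warning(message); });
    }
    // The stored lambda holds `this`, so a copied or moved Container would
    // still call back into the original object; disallow both here.
    Container(const Container&) = delete;
    Container& operator=(const Container&) = delete;

    void warning(char const* message) { warnings.emplace_back(message); }
    void process(int value) { h.handle(value); }
};
```

The Handler only knows about a callable taking a char const*, so it stays independent of Container.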
Alternatively, you can make the Handler dependent on its container. This introduces quite strong coupling so it’s better to be avoided but sometimes it makes sense (e.g. if you cannot use C++11 features yet), and the strong coupling can be weakened by using an interface (the following uses late binding; sometimes, templates might be more appropriate):
struct CanWarn {
    virtual void warning(char const*) const = 0;
    virtual ~CanWarn() { }
};
class Handler {
    CanWarn const* warning_dispatcher;
public:
    void set_warning_dispatcher(CanWarn const* dispatcher) {
        warning_dispatcher = dispatcher;
    }
    …
};
class Container : public CanWarn { … };