Functional data structures in C++

Does anyone know of a C++ data structure library providing functional (a.k.a. immutable, or "persistent" in the FP sense) equivalents of the familiar STL structures?
By "functional" I mean that the objects themselves are immutable, while modifications to those objects return new objects sharing the same internals as the parent object where appropriate.
Ideally, such a library would resemble the STL and work well with Boost.Phoenix. (Caveat: I haven't actually used Phoenix, but as far as I can tell it provides many algorithms but no data structures, unless a lazily computed change to an existing data structure counts. Does it?)

I would look and see whether FC++ developed by Yannis Smaragdakis includes any data structures. Certainly this project more than any other is about supporting a functional style in C++.

This is more of a heads up than a detailed answer, but Bartosz Milewski appears to have done a lot of work on this. See, for example:
http://bartoszmilewski.com/2013/11/13/functional-data-structures-in-c-lists/
Looks like he's implemented a lot of algorithms from Okasaki's book Purely Functional Data Structures here:
https://github.com/BartoszMilewski/Okasaki
N.B. I haven't tried these yet, but they're the first C++ persistent data structures I've seen outside of FC++.
Hopefully, I'll get to trying them soon.
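To make the "new objects sharing the same internals" idea from the question concrete, here is a minimal sketch of a persistent singly linked list built on std::shared_ptr. The class name PList and all of its members are made up for illustration; this is not code from FC++ or from the repository linked above.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical minimal persistent (immutable) singly linked list.
// Every "modification" returns a new list that shares nodes with the old one.
template <typename T>
class PList {
    struct Node {
        T value;
        std::shared_ptr<const Node> next;
    };
    std::shared_ptr<const Node> head_;

    explicit PList(std::shared_ptr<const Node> head) : head_(std::move(head)) {}

public:
    PList() = default;  // the empty list

    bool empty() const { return !head_; }
    const T& front() const { return head_->value; }  // precondition: !empty()

    // O(1): one new node is allocated; the existing nodes are shared, not copied.
    PList push_front(T v) const {
        return PList(std::make_shared<Node>(Node{std::move(v), head_}));
    }

    // O(1): the result aliases the existing tail nodes.
    PList pop_front() const { return PList(head_->next); }  // precondition: !empty()

    std::size_t size() const {
        std::size_t n = 0;
        for (const Node* p = head_.get(); p; p = p->next.get()) ++n;
        return n;
    }

    // True when both lists start at the very same node, i.e. share storage.
    bool shares_storage_with(const PList& other) const { return head_ == other.head_; }
};
```

With this sketch, pushing onto a list leaves the original untouched, and popping the new list gives back something that points at the original's nodes. Reference counting keeps shared tails alive for as long as any list still uses them.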

Related

How to go from a handle contained in "sub" structure in C to a simple object in C++?

The question is in the title, but I think it deserves some explanation, as it may be unclear:
I must rewrite in C++ an API currently written in C. The functions take handles as parameters, and those handles are contained in a structure of structures (of structures)...
This means that, to manipulate a handle, the user of the API must write something like: getHandleValue(struct1.subStruct1.myHandle);
One of my main objectives in rewriting the code in C++ is to implement all of this in Object Oriented style.
So I'd like something like: myObject->getValue; this would also avoid tediously spelling out the handle with all the structures and substructures (reminder: struct1.subStruct1.myHandle).
The main issue I encounter is that two handles from two different subStructures can have the same name. The same goes for the subStructures: two can have the same name in two different structures.
So my question is:
Is it possible to drop the tedious chain of . accesses and make the kind of call I want possible? If not with an object, is it possible with a simple handle (getHandleValue(myHandle)), somehow "hiding" the full path to the handle from the user?
And in any case, when you call handle1 for instance, how can you tell whether you are calling the handle1 from subStructure1 or the handle1 from subStructure2?
If you wanted to make your question more useful for both yourself and others, you'd probably need to tell us a bit more about the problem domain, and what the API is for. As it stands, it's a question whose original form would not be useful to anyone, yourself included, since its narrow scope bypasses everything that you really would like to know but don't know yet that you need to know :) You don't want to make the question too wide in scope, since then it may become off-topic on SO, thus your application-specific details would be needed. I'm sure you could present them in a generic way so that you wouldn't spill any secrets - but we do need to know the "concrete shape" of the problem domain whose API you'd be reimplementing.
It's a trivial task as presented, but it's up to you to decide which handle is actually needed, so if multiple handles have the same name, you have to distinguish between them somehow, e.g. by using different getter method names:
auto MyClass::getBarHandle() const { return foo.bar.h1; }
auto MyClass::getBazHandle() const { return foo.baz.h1; }
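For completeness, the two getters above could sit in a class along these lines. This is only a sketch: the Handle alias and the layout of Foo are stand-ins invented here, since the question does not show the real types.

```cpp
// Hypothetical stand-ins for the C API's types; the real handle type and
// struct layout are not given in the question.
using Handle = int;

struct Foo {
    struct { Handle h1; } bar;
    struct { Handle h1; } baz;
};

// The wrapper holds the nested struct and exposes each handle under an
// unambiguous name, so callers never spell out foo.bar.h1 themselves.
class MyClass {
public:
    explicit MyClass(Foo f) : foo(f) {}
    Handle getBarHandle() const { return foo.bar.h1; }
    Handle getBazHandle() const { return foo.baz.h1; }

private:
    Foo foo;
};
```

The two h1 members never clash because each getter names the path once, inside the class.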
Alas, you don't really want the answer to this detail yet - the implementation details have obscured the big picture here, and this is a classical XY problem. I'd be very leery of assuming that the concept of low-level "handles" needs to be captured directly in your C++ API. It may be that iterators, object references and values are all that the user will need - who knows at this point. This has to be a conscious choice, not just parroting the C API.
You're not "porting" an API to C++. There's no such thing. Whoever uses such a term has no idea what they are talking about. You have to design a new API in C++, and then reuse the C code (or even the C API as-is, if needed) to implement it. Thus you need to understand the C++ idioms - how anyone writing C++ expects a C++ API to behave. It should be idiomatic C++. Same could be said of any expressive high level language, e.g. if you wanted to have a Python API, it should be pythonic (meaning: idiomatic Python), and probably far removed from how the C API might look.
Points to consider (and that's necessarily just a fraction of what you need to think about):
iterator support so that your data structures can be traversed - and that must work with range-for, otherwise your API will be universally hated.
useful range/iterator adapters and predicate functions, so that the data can be filtered to answer commonly asked questions without tedium (say you want to iterate over elements that fulfill certain properties).
value semantics support where appropriate, so that you don't prematurely pessimize performance by forcing the users to only store the objects on the heap. Modern C++ is really good at making value types useful, so the "everything is accessed via a pointer" mindset is rather counterproductive.
object and sub-object ownership - this ties into value semantics, too.
appropriate support of both non-modifying and modifying access, i.e. const iterators, const references, potential optimizations implied by non-modifying access, etc.
see whether PIMPL would be helpful as an implementation detail, and if so - does it make sense to leverage it for implicit sharing, while also keeping in mind the pitfalls.
You need to have real use cases in mind - ways to easily accomplish complex tasks using the power of the language and its standard library - so that your API won't be in the way. A good C++ API will not resemble its counterpart C API at all, really, since the level of abstraction expected of C++ APIs is much higher.
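A rough sketch of what a few of the points above (iterator support, range-for, const access, value semantics) can look like in practice. Item, Collection and the helper functions are illustrative names only; nothing here is taken from the API under discussion.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

// Hypothetical value type and container; the names are illustrative only.
struct Item {
    std::string name;
    int value;
};

class Collection {
public:
    using iterator = std::vector<Item>::iterator;
    using const_iterator = std::vector<Item>::const_iterator;

    void add(Item item) { items_.push_back(std::move(item)); }

    // begin()/end() are all that a range-based for loop needs.
    iterator begin() { return items_.begin(); }
    iterator end() { return items_.end(); }
    const_iterator begin() const { return items_.begin(); }
    const_iterator end() const { return items_.end(); }

private:
    std::vector<Item> items_;  // elements are stored by value, not via pointers
};

// Non-modifying access: takes the collection by const reference.
int total_value(const Collection& c) {
    int sum = 0;
    for (const Item& item : c) sum += item.value;
    return sum;
}

// Standard algorithms work on the same iterators, e.g. counting by predicate.
int count_above(const Collection& c, int threshold) {
    return static_cast<int>(std::count_if(
        c.begin(), c.end(),
        [threshold](const Item& item) { return item.value > threshold; }));
}
```

A caller can then write `for (Item& item : collection) item.value *= 2;` for modifying access, while functions that take a const reference only ever see const iterators.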
implement all of this in Object Oriented style.
The task isn't to write in some bastardized "C with objects" language, since that's not what C++ is all about. In C++, all encapsulated data types are classes, but that doesn't mean much - in C you also would be operating on objects, and a good C API would provide a degree of encapsulation too. The term "object" as it applies to C++ usually means a value of some type, and an integer variable is just as much an object as a std::vector variable would be.
It's a task that starts at a high level. And once the big picture is in place, the details needed to fill it in would become self-evident, although this certainly requires experience in C++. C++ APIs designed by fresh converts to C++ are universally terrible unless said converts are mentored to do the right thing or have enough software engineering experience to explore the field and learn quickly. You'd do well to explore various other well-regarded C++ APIs, but this isn't something that can be done in one afternoon, I'm afraid. If your application domain is similar to other products that offer C++ APIs, you may wish to limit your search to that domain, but you're not guaranteed that the APIs will be of high quality, since most commercial offerings lag severely behind the state of the art in C++ API design.
@Unslander Monica:
First, thanks for your fast and dense answer. There's a lot of useful information and some technical terms I didn't know about so thanks very much !
You're not "porting" an API to C++. There's no such thing. Whoever uses such a term has no idea what they are talking about.
I didn't say I was porting the API, I just said that I was rewriting it, making another version in a different language. And yes, I'm a "fresh convert" as you say, but I'm not completely ignorant. :)
I did do some high-level work; for instance, I made a class diagram and use cases. I also put myself in a user's shoes and called the API functions the way I'd expect to.
But now that it comes to the implementation, I have some questions about feasibility. The question I asked in my post was more a matter of curiosity than a distress call...
Anyway, as you guessed I can't talk much about my project since it's private. But what I can do is give you the big picture
Currently : This is generated automatically from a XML file. We parse it, then create the following type of structure :
struct {
    HANDLE hPage;
    struct {
        HANDLE hLine1;
        struct {
            HANDLE hWord;
        } tLine1;
        HANDLE hLine2;
        struct {
            HANDLE hWord;
        } tLine2;
    } tPage;
} tBook;
The user then calls any object via its handle. For example getValue(tBook.tPage.tLine2.hWord);
This is in C. In C++, it won't be structures but classes with a collection of objects defined by me. The class Page will have a collection of Lines for instance.
class Page {
private:
    std::list<Line> lines;
};
The functions available to the user are mostly basic ones (set/get value or state, wait...). The API's job is to have each of its functions call several functions from various underlying software components.
Concerning your remarks,
Thus you need to understand the C++ idioms - how anyone writing C++ expects a C++ API to behave. It should be idiomatic C++.
I've already thought of ways to introduce RAII, STL lib, smart pointers, overloaded operators... etc
iterator support so that your data structures can be traversed - and that must work with range-for
What do you mean by "range-for" ? Do you mean range-based for loops ?
so the "everything is accessed via a pointer" mindset is rather counterproductive.
That's more the philosophy of the current API in C, not mine :)
The task isn't to write in some bastardized "C with objects" language
No, of course not. But the current API's workings are very, very hard to understand, and some functions are really dense and sometimes too complicated to even rewrite in a different way.
Due to time constraints, unfortunately I won't be able to adapt all of the API, and my first thought when I saw the code was "OK... how do I do it in C++? In C, it's handles stored in structures; in C++ would it be classes storing handles, or directly objects?" Hence me saying "rewrite it Object Oriented style" ;) Sorry if that came out wrong.
Also you're right about exploring other APIs, that's what I've been doing with Qt framework. And, I lack C++ experience, that's why I come here, maybe I'm missing something simple here, or something I just don't know yet !
I'm here to learn, because I don't want to make a "terrible API", just like you said in your pep talk... ;)
Anyway, I hope that this answer helps you to understand a little more my problem!

Intrusive algorithms equivalents in Rust

I'm looking at the Rust programming language and trying to convert my C++ thinking to Rust. Common data structures such as lists and trees have previously been implemented with pointers in C++, and I'm not sure how to implement the exact equivalents in Rust. The data structures I'm interested in are the intrusive ones, similar to what is found in the Boost.Intrusive library; these are useful in embedded/systems programming.
The linked list example in Rust (Dlist) is pretty straightforward, but it uses a container type where the actual type is inside the container. The intrusive approach I'm looking for is the other way around: you have a main type into which the list node is inserted as a member or inherited as a base.
The famous linked list in Linux is another example, where the list data is in the members of the structures. This is like the member-hook variant in Boost.Intrusive. It lets you put your type in several lists/trees at the same time. How would this work with Rust?
So I'm unsure how to convert these kind of design patterns to Rust that I'm used to in C/C++. Anyone who had any success understanding this?
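For readers who have not met the pattern, here is a rough C++ sketch of the member-hook style being described. It is deliberately simplified: the hook stores a pointer to the next element directly instead of using the container_of trick from the Linux kernel, and Hook, Task and TaskList are names invented for this example, not Boost.Intrusive types.

```cpp
#include <cstddef>

struct Task;

// The link lives inside the element, so the list itself never allocates.
struct Hook {
    Task* next = nullptr;
};

// One hook per list the object may belong to.
struct Task {
    int id = 0;
    Hook ready_hook;    // link used by the "ready" list
    Hook waiting_hook;  // link used by the "waiting" list
};

// The template argument selects which member hook this list threads through.
template <Hook Task::*Member>
class TaskList {
public:
    // The list does not own t; the caller must keep it alive while linked.
    void push_front(Task& t) {
        (t.*Member).next = head_;
        head_ = &t;
    }

    Task* front() const { return head_; }

    std::size_t size() const {
        std::size_t n = 0;
        for (Task* p = head_; p; p = (p->*Member).next) ++n;
        return n;
    }

private:
    Task* head_ = nullptr;
};

using ReadyList = TaskList<&Task::ready_hook>;
using WaitingList = TaskList<&Task::waiting_hook>;
```

The same Task can be linked into a ReadyList and a WaitingList simultaneously, and neither list owns it. That missing owner is exactly what makes the pattern awkward to express in Rust.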
Rust wants you to think about ownership and lifetimes. Who owns the members and how long will they live?
In the question of Dlist, the answer is 'the container'. With intrusive algorithms there is no clear answer. Members of one list might be reused in another list, while others get destroyed with the first list. Ultimately, you probably want to use reference counting (std::sync::Arc).
I think there are two ways to accomplish something like that in Rust. Let's take a look at implementation of graphs, which typically use intrusive links.
The first approach relies on Rc<RefCell<Node>>. You can find more details here: Graphs and arena allocation
The second approach relies on vector indexes. You can find more information here: Modeling Graphs in Rust Using Vector Indices.
I believe the second approach is better, but I have not done any testing.

C++ Data structures API Questions

Which C++ library provides a data structures API that matches the one provided by java.util.* as closely as possible?
Specifically, I am looking for the following DS and following Utility Functions:-
**DS**: Priority Queue, HashMap, TreeMap, HashSet,
TreeSet, ArrayList, String most importantly.
**Utility**: Arrays.* , Collections.*, Regex, FileHandling etc.
and other converters and algorithms like Binary Search, Sort, NthElement etc.
My guess is that Boost may be able to do all these, but I find it too bulky and is non-trivial to add it into a project, especially, when I want to quickly get started on something and when although the code would require all these data structures, the code overall is not going to be that huge to warrant spending lot of effort in setting up libraries.
An example would be if someone had to write a C++ program to do Network Flow Algorithm for a school assignment. I am sure I could come up with better examples, but this one's on top of my head.
Thanks
Ajay
All of those containers are available in some form in the SC++L:
Priority Queue std::priority_queue (this is actually a container adapter, rather than a container itself - that is, it works "on top of" another container, usually std::vector or std::deque).
HashMap std::unordered_map (or if your compiler doesn't support C++0x, there's boost::unordered_map)
TreeMap std::map
HashSet and TreeSet are basically the same as HashMap and TreeMap, except the key and value are the same thing. However, there's also std::unordered_set and std::set.
ArrayList is the venerable std::vector
String is the venerable std::string. Many of the functions you get in the Java String class can be found in the Boost.Strings library.
Do not be afraid of setting up boost. In my experience, you set it up once and then use it over and over again in all of your projects. Also, all of the libraries that I mentioned above are header-only libraries. That means you don't actually need to build/install any libraries, just reference the headers.
For the other things, I'm not so sure, since I don't know Java all that well. At the end of the day, you're not going to find a library that's "just like Java, except written in C++" because that would be kind of pointless. A C++ library is written to play to C++'s strength, a Java library is written to play to Java's strengths. To try and shoehorn a library designed for one language into another doesn't make sense to me.
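As a quick illustration of the mapping listed above, here is a small sketch using only the standard library (C++11 or later assumed). The function names are made up for the example; the containers and algorithms are the standard ones.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <unordered_map>
#include <vector>

// PriorityQueue -> std::priority_queue. It is a max-heap by default;
// std::greater gives the min-first ordering Java's PriorityQueue uses.
int smallest(const std::vector<int>& values) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> pq(
        values.begin(), values.end());
    return pq.top();  // precondition: values is not empty
}

// HashMap -> std::unordered_map: word frequency count.
std::unordered_map<std::string, int> count_words(
    const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    for (const std::string& w : words) ++counts[w];
    return counts;
}

// TreeMap -> std::map: keys iterate in sorted order.
std::string first_key(const std::map<std::string, int>& m) {
    return m.begin()->first;  // precondition: m is not empty
}

// Arrays.sort + binary search -> std::sort + std::binary_search.
bool sorted_contains(std::vector<int> v, int x) {
    std::sort(v.begin(), v.end());
    return std::binary_search(v.begin(), v.end(), x);
}

// NthElement -> std::nth_element: k-th smallest (0-based) without a full sort.
int kth_smallest(std::vector<int> v, std::size_t k) {
    std::nth_element(v.begin(), v.begin() + k, v.end());
    return v[k];  // precondition: k < v.size()
}
```

None of this needs Boost, which may matter for the "quick school assignment" scenario in the question.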

What is the point of STL?

I've been programming c++ for about a year now and when i'm looking about i see lots of references to STL.
Can some one please tell me what it does?
and the advantages and disadvantageous of it?
also what does it give me over the borlands VCL or MFC?
thanks
It's the C++ standard library that gives you all sorts of very useful containers, strings, algorithms to manipulate them with etc.
The term 'STL' is outdated IMHO, what used to be the STL has become a large part of the standard library for C++.
If you are doing any serious C++ development, you will need to be familiar with this library and preferably the boost library. If you are not using it already, you're probably working at the wrong level of abstraction or you're constraining yourself to a small-ish subset of C++.
STL stands for Standard Template Library. This was a library designed mainly by Stepanov and Lee which was then adopted as part of the C++ Standard Library. The term is gradually becoming meaningless, but covers these parts of the Standard Library:
containers (vectors, maps etc.)
iterators
algorithms
If you call yourself a C++ programmer, you should be familiar with all of these concepts, and the Standard Library implementation of them.
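A small sketch of how the three parts fit together: the algorithms are written against iterators, so the same code works for any container. The helper names sum_range and evens are invented for this example.

```cpp
#include <algorithm>
#include <iterator>
#include <list>
#include <numeric>
#include <vector>

// Algorithms see only iterators, never the container type, so this one
// function template works for std::vector, std::list and plain arrays.
template <typename Iterator>
int sum_range(Iterator first, Iterator last) {
    return std::accumulate(first, last, 0);
}

// Copy the even elements of any int range into a vector.
template <typename Iterator>
std::vector<int> evens(Iterator first, Iterator last) {
    std::vector<int> out;
    std::copy_if(first, last, std::back_inserter(out),
                 [](int x) { return x % 2 == 0; });
    return out;
}
```

Calling `sum_range(v.begin(), v.end())` on a vector and `sum_range(l.begin(), l.end())` on a list goes through exactly the same code path.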
The STL is the Standard Template Library. Like any library it's a collection of code that makes your life easier by providing well tested, robust code for you to re-use.
Need a collection (map, list, vector, etc) they're in the STL
Need to operate on a collection (for_each, copy, transform, etc,) they're in the STL
Need to do I/O, there's classes for that.
Advantages
1, You don't have to re-implement standard containers (cus you'll get it wrong anyway)
Read this book by Nicolai M.Josuttis to learn more about the STL, it's the best STL reference book out there.
It provides common useful tools for the programmer! Iterators, algorithms, etc. Why re-invent the wheel?
"advantages and disadvantageous" compared to what? To writing all that code yourself? Is not it obvious? It has great collections and tools to work with them
Wikipedia has a good overview: http://en.wikipedia.org/wiki/Standard_Template_Library
The STL fixes one big deficiency of C++ - the lack of a standard string type. This has caused innumerable headaches as there have been thousands of string implementations that don't work well together.
It stands for standard template library
It is a set of functions and classes that are there to save you a lot of work.
They are designed to use templates, which is where you define a function without fixing the data type it will work on.
for example, vector more or less lets you have dynamic arrays. when you create an instance of it, you say what type you want it to work for. This can even be your own data type (class).
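To make that concrete, here is a minimal sketch of a function template and a vector of a user-defined type. The names largest and Player are made up for illustration.

```cpp
#include <string>
#include <vector>

// A function template: written once, instantiated for whatever element type
// the caller uses. It only requires that T supports operator<.
template <typename T>
const T& largest(const std::vector<T>& items) {
    const T* best = &items.front();  // precondition: items is not empty
    for (const T& item : items) {
        if (*best < item) best = &item;
    }
    return *best;
}

// A user-defined type works too, as long as it provides operator<.
struct Player {
    std::string name;
    int score;
};

bool operator<(const Player& a, const Player& b) { return a.score < b.score; }
```

The same largest() then works for std::vector<int> and std::vector<Player> without any changes.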
It's a hard thing to think about, but it is hugely powerful and can save you loads of time.
Get reading up on it now! You won't regret it.
It gives you another acronym to toss around at cocktail parties.
Seriously, check the intro docs starting e.g. with the Wikipedia article on STL.
The STL has iterators. Sure, collections and stuff are useful, but the power of iterators is gigantic and, in my humble opinion, makes the rest pale in comparison.

how do i get started using boost

I hear a lot about boost here and I am beginning to think it could help a lot with my software development. More so in concurrency and memory management in my particular case as we have had a lot of bugs in this area.
What are the key language features I need to polish up on to effectively benefit from using boost and to shorten the learning curve? I have seen that function objects are commonly used so I would probably need to polish up on that.
Additionally, are there any tutorials and 101 resources I can quickly look at to just get a feel and understanding on using boost.
I realise there is a lot boost offers and I have to pick the right tools for the right job but any leads will help.
Related
How to learn boost (no longer valid; HTTP return status 404)
Boost has an unimaginable number of libraries.
Easy ones to get started on are
noncopyable
array
circular_buffer
foreach
operators (one of my personal favorites)
smart_ptr
date_time
More advanced ones include
lambda
bind
iostreams
serialization
threads
Getting used to boost takes time, but I assure you it will make your life much better. Plus, looking at how the boost libraries are coded will help you get better at c++ coding, especially templates.
You asked what you should look up before trying boost. I agree that function objects are a great thing to research. Also, make sure to read up on template programming. A common stumbling block is knowing when to use the typename qualifier for dependent types. For the most part, however, the libraries are very well documented, with examples and reference materials.
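In case the typename point is unfamiliar, here is a minimal sketch of where the qualifier is needed. The function names are invented for this example.

```cpp
#include <cstddef>
#include <list>
#include <vector>

// Container::const_iterator depends on the template parameter, so the
// compiler cannot know it names a type until the template is instantiated.
// The typename keyword says so; inside a function body it is required.
template <typename Container>
std::size_t count_elements(const Container& c) {
    std::size_t n = 0;
    for (typename Container::const_iterator it = c.begin(); it != c.end(); ++it) {
        ++n;
    }
    return n;
}

// The same applies to a dependent return type. (As far as I know, C++20
// makes the keyword optional in this particular position; writing it is
// still valid and works with older compilers.)
template <typename Container>
typename Container::value_type first_element(const Container& c) {
    return *c.begin();  // precondition: c is not empty
}
```

Leaving out typename in the loop declaration typically produces a confusing compiler error, which is why it is worth knowing about before reading Boost headers.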
Learning boost is discussed here. As for language features that are useful? All of them. C++ is a dangerous language to use if you don't know enough of it. RAII, functors/function objects and templates probably cover the most important aspects. Boost is designed similarly to the STL, so knowing your standard library is essential. Boost itself uses a lot of template metaprogramming, but as a library user, you won't often need that (unless you start playing with Boost.MPL)
Bugs related to memory management are a good indicator that it's C++, rather than Boost you need to brush up on. The techniques for handling memory safely are well known, and not specific to Boost. (With the obvious exception of Boost's smart pointers). RAII is probably the most important concept to understand to deal with these kinds of issues.
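A minimal sketch of RAII, for anyone who has not seen it spelled out. The global counter stands in for a real resource (memory, a file handle, a lock); ResourceGuard and the two helper functions are hypothetical names used only here.

```cpp
#include <stdexcept>

// Hypothetical bookkeeping standing in for a real resource.
int g_open_resources = 0;

// RAII: acquire in the constructor, release in the destructor. The release
// runs on every exit path, including when an exception propagates.
class ResourceGuard {
public:
    ResourceGuard() { ++g_open_resources; }
    ~ResourceGuard() { --g_open_resources; }

    // Non-copyable, so the resource is released exactly once.
    ResourceGuard(const ResourceGuard&) = delete;
    ResourceGuard& operator=(const ResourceGuard&) = delete;
};

// May throw halfway through; the guard still releases the resource.
void risky_operation(bool fail) {
    ResourceGuard guard;
    if (fail) throw std::runtime_error("something went wrong");
}

// Returns true if an exception was caught.
bool run(bool fail) {
    try {
        risky_operation(fail);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}
```

Whether risky_operation returns normally or throws, g_open_resources is back to its previous value afterwards, with no explicit cleanup code at the call site.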
What are the key language features I need to polish up on to effectively benefit from using boost and to shorten the learning curve?
Templates
Functors
Exceptions
STL
Iterators
Algorithms
Containers
... among others.
are there any tutorials and 101 resources I can quickly look at to just get a feel and understanding on using boost.
Boost is well documented. Start here.
There are so many libraries that it is easy to get lost. I'd say start with something simple, maybe smart pointers or Boost.Test (Unit Test framework) -- which will quickly help you get started. Also, try to think of a problem you cannot solve with the STL easily. Then look up Boost documentation or post here.
If you are comfortable with functional programming look at MPL/Lambda libraries.
The first thing IMO is smart pointers. Integration into new code is simple, and usually not a problem for existing code. They make memory management easy, and work for many other resources, too.
C++ gives you the power to manage your own memory, smart pointers let you (mostly) wing it when you don't need to.
The second would be - as you mentioned - function objects. They close a big gap within C++ that is traditionally solved through inheritance, which is too strong a coupling in many cases.
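A sketch of the decoupling meant here, written with std::function (the standard-library counterpart of boost::function, available since C++11) so that it compiles without Boost. Button and ClickCounter are names invented for the example.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// A button that accepts any callable with the right signature. Handlers do
// not have to derive from a common "listener" base class.
class Button {
public:
    void on_click(std::function<void(const std::string&)> handler) {
        handlers_.push_back(std::move(handler));
    }

    void click(const std::string& label) const {
        for (const auto& h : handlers_) h(label);
    }

private:
    std::vector<std::function<void(const std::string&)>> handlers_;
};

// A hand-written function object: some state plus operator().
struct ClickCounter {
    int* count;
    void operator()(const std::string&) const { ++*count; }
};
```

A ClickCounter, a lambda, or a plain function can all be registered on the same Button, and none of them needs to know about the others or about Button's class hierarchy.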
I have only little experience with boost outside these two, but most of the remainder is fairly "situational" - you may or may not need it. Get an overview over the libraries, and see what you need.
boost::any and boost::variant are good if you need a variant data type, with two different approaches.
boost::regex if you need some text parsing.
boost::thread and boost::filesystem help you write portable code. If you already have good platform specific libraries, you might not need them - but they are better than API or C++ level in any case.
Maybe you like my introduction to boost smart pointers, and a rather unorthodox use for them.
Try Björn Karlsson's book: Beyond the C++ Standard Library: An Introduction to Boost. It's pretty straightforward and easy to grasp. I read this after I'd finished Scott Meyers' three C++ books (the Effective series).
After reading Beyond the C++ Standard Library: An Introduction to Boost, I would recommend casually browsing the documentation on boost.org, just to get an idea of what's available. You can do a deep dive into a specific boost library when it looks like a good fit for a particular application.
I think shared_ptr should be the easiest place to start.
Start using it in place of plain pointers or auto_ptr.
You can also look into weak_ptr.
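A minimal sketch of shared_ptr and weak_ptr in use. It is written against std::shared_ptr and std::weak_ptr (C++11), which were modeled on the Boost versions; the operations shown here (copying, use_count, lock) are also available on boost::shared_ptr and boost::weak_ptr. Document and the helper functions are made-up names.

```cpp
#include <memory>
#include <string>

struct Document {
    std::string title;
};

// Copying a shared_ptr adds an owner; the object is destroyed when the
// last owner goes away. Returns the owner count while the copy is alive.
long owners_after_copy(const std::shared_ptr<Document>& doc) {
    std::shared_ptr<Document> copy = doc;  // bumps the reference count
    return doc.use_count();
}

// weak_ptr observes without owning; lock() yields an empty pointer once
// the object has been destroyed.
bool is_alive(const std::weak_ptr<Document>& observer) {
    return observer.lock() != nullptr;
}
```

A weak_ptr created from a shared_ptr does not keep the Document alive, which is what makes it useful for breaking reference cycles and for caches.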