C++ Library API Design issues

C++ Library API Design issues - c++

I am creating a C++ library for use by third parties. While I am familiar with creating C libraries I have little experience creating C++ libraries. My concern is that there are additional issues presented by C++ library APIs which I need to consider. Such as :
Exception handling across the API.
User access of class members for objects created by the library
User destruction of objects created by the library and vice versa.
Who knows what else ...
What must I consider above and beyond that which I must consider for C libraries?
Best Regards

C++ is a more complex language than C, so there are a lot more issues that you need to be aware of. There are always language neutral concerns like how to design a good public/private separation, documentation, versioning, maintaining backward compatibility, etc. But there also various C++-specific issues, such as const correctness, your use of templates, exceptions vs return codes, not exposing data members, your use of inheritance, considering copy constructors and assignment operators, use of pointers or references, default arguments, friends, use of inline, etc.
In full disclosure, I am the author of the book "API Design for C++". Without wanting to sound like I'm pushing the book, it does cover exactly the topic that you're asking about: how to design good APIs for C++. You can view the table of contents of the book to give you a good overview of the issues you should be considering. Also, the sample chapter includes a discussion of the pimpl idiom, which I personally like as a way to provide better encapsulation in C++.
http://www.apibook.com/blog/contents

Microsoft do provide design guidelines for class libraries, not sure if such exists for Linux as well, but these are general guidelines and should apply to various platforms.
http://msdn.microsoft.com/en-us/library/czefa0ke(v=vs.71).aspx

Related

why c++ allows pointers if they can generate problems like accessing private members?

I have read about c++ pointers and their pitfalls, one of which is that one can access private data members of other class objects using pointer hacks as mentioned here and here.
Surely pointers in c++ gives a lot of flexibility to the language but what's the use of it if it can hamper the core OOP features of the language like data hiding ? Is this really a trade-off between security & flexibility ?

C++ is first and foremost a low level language that extends traditional C with fancier language constructs. With few exceptions, nothing was removed.
C++ is not just an OO language. It is functional, declarative, procedural, structural amd object oriented. It can even be used as a portable assembler, like C often is.
Pointers are a thin abstraction on top of how CPUs access memory. Having access to raw pointers in-language enables a huge amount of efficient code. Every systems programming language permits access to raw pointers; sometimes guarded by "unsafe" blocks; and C++ is also a systems programming language.
If you are coming at C++ from a single perspective, and wondering why it seems strangely shaped for your goals, try looking at it from another direction.

C++ was originally built on the C language. It added features without removing any. C had pointers and therefore C++ has pointers. Compatibility with the C interface remains an essential feature of C++. Therefore pointers will never be not-allowed.
Besides compatibility, pointers are very useful. Introducing object oriented programming to a system programming language being the initial reason for C++, it should be noted that indirection is essential for implementing runtime polymorphism. Sure, there is another method of indirection in C++ - references, but references are more limited and cannot be used to implement all algorithms that require the use of pointers.
Pointers are also required to implement node based data structures such as trees or linked lists.
c++ pointers and their pitfalls, one of which is that one can access private data members of other class objects using pointer hacks as mentioned here
That trick isnt so much enabled by the pointer but by the use of reinterpret_cast. It would be much more sensible to question why that language feature is allowed.
and here
Both this and the previous are cases where the language does not allow breaking of encapsulation. The problem isn't the pointer. The problem is that C++ - like C - has undefined behaviour (UB) when certain rules are violated.
You might be wondering, why does a language have UB. The answer is that if the rules were required to be checked at runtime to avoid UB, the program would necessarily be slower than it could be when we assume that the program didn't make a mistake. Since C and C++ are low level system programming languages, performance was chosen at the cost of protecting against bad programmers.

There is no trade-off involving security. That's because there is nothing to trade, since C++ does not offer security. The language offers tools (like data hiding) and compilers offer guardrails (most warnings) to help you do the right thing, but if you are determined to do the wrong thing in your program, you have the freedom to do so. Vive la résistance! Watch program crash override and acid burn.
Besides, if you want to access private members, there is a much simpler method that does not involve undefined behavior: edit the header files, and either change "private" to "public" or add your class/function as a friend. (I'm not sure if this affects ABI. You might need to recompile libraries. I did say "simpler" not "quicker".) No pointers involved.

Why C++ is called federation of languages?

I was reading a tutorial on C++ and the following line came up. No other details were provided to explain further
C++ is a "federation of languages" and supports multi-paradigm programming, there are many options available to us.
What does it mean when C++ is called federation of language and also what is multi-paradigm programming?

This is the explanation from Effective C++ Third Edition 55 Specific Ways to Improve Your Programs and Designs
By Scott Meyers, Item 1: View C++ as a federation of languages.
Today's C++ is a multiparadigm programming language, one supporting a
combination of procedural, object-oriented, functional, generic, and
metaprogramming features. This power and flexibility make C++ a tool
without equal, but can also cause some confusion. All the "proper
usage" rules seem to have exceptions. How are we to make sense of such
a language?
The easiest way is to view C++ not as a single language but as a
federation of related languages. Within a particular sublanguage, the
rules tend to be simple, straightforward, and easy to remember. When
you move from one sublanguage to another, however, the rules may
change. To make sense of C++, you have to recognize its primary
sublanguages. Fortunately, there are only four:
C. Way down deep, C++ is still based on C. Blocks, statements, the preprocessor, built-in data types, arrays, pointers, etc., all come
from C. In many cases, C++ offers approaches to problems that are
superior to their C counterparts (e.g., see Items 2 (alternatives to
the preprocessor) and 13 (using objects to manage resources)), but
when you find yourself working with the C part of C++, the rules for
effective programming reflect C's more limited scope: no templates, no
exceptions, no overloading, etc.
Object-Oriented C++. This part of C++ is what C with Classes was all about: classes (including constructors and destructors),
encapsulation, inheritance, polymorphism, virtual functions (dynamic
binding), etc. This is the part of C++ to which the classic rules for
object-oriented design most directly apply.
Template C++. This is the generic programming part of C++, the one that most programmers have the least experience with. Template
considerations pervade C++, and it's not uncommon for rules of good
programming to include special template-only clauses (e.g., see Item
46 on facilitating type conversions in calls to template functions).
In fact, templates are so powerful, they give rise to a completely new
programming paradigm, template metaprogramming (TMP). Item 48 provides
an overview of TMP, but unless you're a hard-core template junkie, you
need not worry about it. The rules for TMP rarely interact with
mainstream C++ programming.
The STL. The STL is a template library, of course, but it's a very special template library. Its conventions regarding containers,
iterators, algorithms, and function objects mesh beautifully, but
templates and libraries can be built around other ideas, too. The STL
has particular ways of doing things, and when you're working with the
STL, you need to be sure to follow its conventions.

"Federation of languages" means the big breadth of diverse features and ways to apply the C++ language.
Multiparadigmatic languages combined paradigms. Examples are F-Sharp, OCaml and Swift. So it's a group of language-styles.

Yes this is from Effective C++. The writer is just saying that C++ grammar comes from a series of sub-languages. Read about it here.
As for multi-paradigm programming, it's the ability of a language to support more than one programming style. This allows flexibility for different tasks. A google search should answer this for you.

Link compatibility between C++ and D

D easily interfaces with C.
D just as easily interfaces with C++, but (and it's a big but) the C++ needs to be extremely trivial. The code cannot use:
namespaces
templates
multiple inheritance
mix virtual with non-virtual methods
more?
I completely understand the inheritance restriction. The rest however, feel like artificial limitations. Now I don't want to be able to use std::vector<T> directly, but I would really like to be able to link with std::vector<int> as an externed template.
The C++ interfacing page has this particularly depressing comment.
D templates have little in common with
C++ templates, and it is very unlikely
that any sort of reasonable method
could be found to express C++
templates in a link-compatible way
with D.
This means that the C++ STL, and C++
Boost, likely will never be accessible
from D.
Admittedly I'll probably never need std::vector while coding in D, but I'd love to use QT or boost.
So what's the deal. Why is it so hard to express non-trivial C++ classes in D? Would it not be worth it to add some special annotations or something to express at least namespaces?
Update:
"D has namespace support in the works" from Walter Bright.

FWIW Qt has an actively developed binding for D: http://www.dsource.org/projects/qtd
I think many components in boost are too highly tied to C++ to be portable meaningfully to other languages.
Using e.g. std::vector is possible if you spend the time on writing regular (e.g. namespace-level) functions that forward to the appropriate member functions. Tedious, but affordable (for higher-level components; probably not for std::vector).
Also, I very recently checked in the standard library a sealed array and sealed binary heap implementation that use reference counting, malloc/free, and deterministic destruction instead of garbage collection (see http://www.dsource.org/projects/phobos/browser/trunk/phobos/std/container.d). Other containers will follow. These containers use three specific techniques (described in my upcoming article "Sealed Containers") to achieve deterministic destruction without compromising program safety.
Hopefully sealed containers will obviate any need to link with STL containers, even for tight applications that can't afford garbage collection.

As Hans Passant mentions in a comment, the level of interoperability you want between D and C++ isn't even supported among different C++ compilers. There is a C++ ABI (Application Binary Interface) standard that seems to have some support, but I'm not sure exactly how extensive (Intel, GCC and ARM compilers seem to follow the ABIs). I haven't had a need to use it, and I'm not sure whether or not Microsoft adheres to it for their x86 or x64 compilers (I imagine it might for ia64, since that's where the ABI standard started).
See "Interoperability & C++ Compilers" by Joe Goodman for some details on the C++ ABI.
Maybe Walter Bright can be convinced to support D interoperability with C++ libraries/objects that follow the ABI standard (though I'm not sure where he might prioritize it).

Just for fun I've been interfacing to some C++ code.
That probably won't answer for your question, but I think you (and some folks among the D community) might be interested in some notes I've made then. Here it goes: TechNotes.

Official C++ language subsets

I mainly use C++ to do scientific computing, and lately I've been restricting myself to a very C-like subset of C++ features; namely, no classes/inheritance except complex and STL, templates only used for find/replace kinds of substitutions, and a few other things I can't put in words off the top of my head. I am wondering if there are any official or well-documented subsets of the C++ language that I could look at for reference (as well as rationale) when I go about picking and choosing which features to use.

There is Embedded C++. It sounds mostly similar to what you're looking for.

Google publishes its internal C++ style guide, which is often referred to as such a subset: https://google.github.io/styleguide/cppguide.html . Ben Maurer, whose company reCAPTCHA was acquired by Google, describes it as follows in this post on Quora:
You can basically think of Google's
C++ subset as C plus a bit of sugar:
The ability to add methods to structs
Basic single inheritance.
Collection and string classes
Scope based resource management.
They also publish a lint tool, cpplint.py.

Not long ago I listened to this SE-Radio podcast - Episode 152: MISRA with Johan Bezem, which introduces MISRA, standard guidelines for C and C++ to ensure better quality, try looking at it.

The GCC developers are about to allow some C++ features. I'm not aware of any official guidelines, yet, but I am pretty sure that they will define some. Have a look at initial report on the mailing list.

Well, latest developments (TR1, C++0x) in C++ made it very much generic, allowing you to do imperative, OOP or even (limited) functional programming in C++. Libraries like Boost also enable you to do very power declarative template-based meta-programming.
I think Boost is the first thing to try out in C++. It's a comprehensive library, which also includes several modules that enable you to program in functional style (Boost.Functional) or making compile-time declarative meta-programming (Boost MPL).

OpenCL has been using C for writing kernels, but they have recently added (or will soon add) C++ bindings and perhaps Java. OpenCL leaves out a number of performance robbing features of C. Excluded are things like function pointers and recursion. Smart pointers and polymorphism also create overhead.
Restrictions on C:
SIMD programming languages
Slightly off topic: Here is a good discussion comparing OpenCL with CUDA using C.
OpenCL or CUDA Which way to go?

The SEI CERT C++ Coding Standard gives a list of rules for writing safe, reliable, and secure systems in C++14. This is not a subset of C++ per se, but as a coding standard like the other answers is a subset in effect by avoiding unsafe, undefined, or easily-misused features (including some common to C).

What is the preferred naming conventions du jour for c++?

I am quite confused by looking at the boost library, and stl, and then looking at people's examples. It seems that capitalized type names are interspersed with all lowercase, separated by underscores.
What exactly is the way things should be done these days? I know the .NET world has their own set of conventions, but it appears to be completely different than the C++ sphere.

What a can of worms you've opened.
The C++ standard library uses underscore_notation for everything, because that's what the C standard library uses.
So if you want your code to look consistent across the board (and actually aren't using external libraries), that is the only way to go.
You'll see boost use the same notation because often their libraries get considered for future standards.
Beyond that, there are many conventions, usually using different notations to designate different types of symbols. It is common to use CamelCase for custom types, such as classes and typedefs and mixedCase for variables, specifically to differentiate those two, but that is certainly not a universal standard.
There's also Hungarian Notation, which further differentiates specific variable types, although just mentioning that phrase can incite hostility from some coders.
The best answer, as a good C++ programmer, is to adopt whatever convention is being used in the code you're immersed in.

There is no good answer. If you're integrating with an existing codebase, it makes sense to match their style. If you're creating a new codebase, you might want to establish simple guidelines.
Google has some.

It's going to be different depending on the library and organization.
For the developer utility library I'm building, for example, I'm including friendly wrapper modules for various conventions in the style of the convention. So, for example, the MFC wrapper module uses the 'm_typeMemberVariable' notation for members, whereas the STL wrapper module uses 'member_variable'. I'm trying to build it so that whatever front-end is used will have the style typical for that type of front-end.
The problem with having a universal style is that everyone would have to agree, and (for example) for every person who detests Hungarian notation, there's someone else who thinks not using Hungarian notation detracts from the basic value of comprehensibility of the code. So it's unlikely there will be a universal standard for C++ any time soon.

Find something you feel comfortable with and stick with it. Some form of style is better than no style at all and don't get too hung up on how other libraries do it.
FWIW I use the Google C++ style guide (with some tweaks).
http://google-styleguide.googlecode.com/svn/trunk/cppguide.xml

We Keep Coding

c++ django amazon-web-services regex python-2.7 google-cloud-platform list unit-testing opengl ember.js