Transpiling to C vs C++ : range of CPU instructions


I am considering the question of transpiling a language (home-grown DSL) to C vs to C++.
I haven't done any 'native' programming for over 15 years, so I want to check my assumptions.
Am I right in assuming that transpiling to the newest C++ version (17) would enable the native compiler to use a much wider range of 'modern' Intel/AMD CPU instructions, resulting in a more efficient executable (beyond the multi-threading / memory-model part of C++, which already by itself seems a good enough reason to go for C++)?
Put another way, isn't a large part of 'more recent' CPU instructions never generated by a C compiler, simply because it has too little information about the programmer intent, due to the simpler syntax of C? I know I could access all CPU instructions with assembler, but that is precisely what I don't want to do. Ideally, I would want the generated code to still be as platform-independent as possible.

All of your assumptions about the relationship between programming language and "modern CPU instructions" are incorrect.
Let's consider the GNU Compiler Collection.
The choice of language here doesn't much matter, as the language front-ends all end up generating the same intermediate form called GIMPLE. The optimizing passes then work on that.
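The range of CPU instructions which can be emitted is controlled by the -march option (and the per-feature flags such as -mavx2), not by the source language; -mtune only adjusts scheduling and tuning for a given CPU without changing which instructions are allowed. For x86, GCC is capable of emitting modern AVX-512 instructions when optimizing some very plain-looking C code. Automatic loop vectorisation is a powerful thing. Try it out: implement memcpy and look at the generated assembly.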
My advice: generate clean, un-clever C code, and crank up the optimization level. Just like you would do if writing code by hand.
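As a minimal sketch of the experiment suggested above, here is a deliberately plain loop of the kind a transpiler could emit. The name scale_add is made up for illustration, and whether the vectoriser actually fires depends on your compiler version and flags. The body sticks to the common subset of C and C++, so it should compile either way.

```cpp
#include <stddef.h>

// Plain, "un-clever" loop: out[i] = a * x[i] + y[i].
// Nothing here names a SIMD instruction; any vector code comes from the optimiser.
void scale_add(float a, const float* x, const float* y, float* out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        out[i] = a * x[i] + y[i];
    }
}
```

Compiling this with something like g++ -O3 -march=native -S and reading the assembly output is a quick way to see which instructions the optimiser picked for your machine.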
You might also consider implementing your language directly as a front-end to GCC or LLVM, without transpiling to C or C++. LLVM was designed for this purpose, intended to make implementing new languages easy, and still taking advantage of modern optimization approaches.

Related

Do different versions of compilers (e.g GCC) generate different performance?

I have wondered for a long time whether newer versions of a C/C++ compiler generate better code with better performance (e.g. G++ 7.3 vs G++ 4.8).
If they do, what is the source of speedup? If not, is it recommended to update the compilers?

Here's a short answer regarding GCC -- there's an extensive list of different benchmark results available on their home website.
For example, looking at a specific run of the OOPACK benchmark by Charles Leggett:
The OOPACK kernels consist of 4 programs to measure the relative performance of C++ compilers vs C compilers for abstract data types. The kernels are constructed in such a way that they can be coded in C or C++. The C programs are compiled by the C++ compiler.
The kernels consist of:
Max measures how well a compiler inlines a simple conditional.
Matrix measures how well a compiler propagates constants and hoists simple invariants.
Iterator measures how well a compiler inlines short-lived small objects.
Complex measures how well a compiler eliminates temporaries.
one of the conclusions reads:
gcc optimized C has somewhat improved between 2.91.66 and 3.x
As expected, having a quick look at some other benchmarks also seems to support the narrative that "newer is better".
Taking the categories from the "Design and Development Goals" listed in the GCC Development Mission Statement, the reasons for improvements fall into one of the three:
New optimizations
Improved runtime libraries
Various other infrastructure improvements
It is important to note that other goals involve "new languages" and "new targets" -- thus the relevance of a new version will be dependent on your use case.
Moreover, reading about the release criteria -- I'd warn against possibly misleading yourself by speaking about "better performance" in general, as compiler designs come with many trade-offs:
In contrast to most correctness issues, where nothing short of correct is acceptable, it is reasonable to trade off behavior for code quality and compilation time. For example, it may be acceptable, when compiling with optimization, if the compiler is slower, but generates superior code. It may also be acceptable for the compiler to generate inferior code on some test cases if it generates substantially superior code on other test cases.
Thus, especially with niche and performance-critical applications, you might want to compare specific compiler versions.
As a side note, you might find it interesting to read more about their development plan that includes the explanation of the version numbering etc.

Yes, newer versions of GCC generate better code and have better performance.
The speedup is from better code-generating algorithms written into GCC.
I would recommend upgrading GCC if there aren't compatibility issues. Newer GCC versions have fewer bugs and generate better code.
You may have to upgrade Binutils too if you upgrade GCC.
Just a note to clarify, this probably doesn't apply to any Microsoft products (see comments). As I don't have any experience with them, I don't know. In general, however, GCC has fewer bugs and better code with each release, which is why I wrote what I did.

Explanation of CUDA C and C++

Can anyone give me a good explanation as to the nature of CUDA C and C++? As I understand it, CUDA is supposed to be C with NVIDIA's GPU libraries. As of right now CUDA C supports some C++ features but not others.
What is NVIDIA's plan? Are they going to build upon C and add their own libraries (e.g. Thrust vs. STL) that parallel those of C++? Are they eventually going to support all of C++? Is it bad to use C++ headers in a .cu file?

CUDA C is a programming language with C syntax. Conceptually it is quite different from C.
The problem it is trying to solve is coding multiple (similar) instruction streams for multiple processors.
CUDA offers more than Single Instruction Multiple Data (SIMD) vector processing, but data streams >> instruction streams, or there is much less benefit.
CUDA gives some mechanisms to do that, and hides some of the complexity.
CUDA is not optimised for multiple diverse instruction streams like a multi-core x86.
CUDA is not limited to a single instruction stream like x86 vector instructions, or limited to specific data types like x86 vector instructions.
CUDA supports 'loops' which can be executed in parallel. This is its most critical feature. The CUDA system will partition the execution of 'loops', and run the 'loop' body simultaneously across an array of identical processors, while providing some of the illusion of a normal sequential loop (specifically CUDA manages the loop "index"). The developer needs to be aware of the GPU machine structure to write 'loops' effectively, but almost all of the management is handled by the CUDA run-time. The effect is hundreds (or even thousands) of 'loops' complete in the same time as one 'loop'.
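To make the 'loop' model above a bit more concrete, here is a sketch in plain C++ rather than CUDA. The programmer writes only the body for one index; a hypothetical helper, run_for_each_index, stands in for the run-time that hands out the indices. On a GPU the bodies would run simultaneously across many processors. This stand-in runs them one after another, so it only illustrates the shape of the code, not the performance.

```cpp
#include <cstddef>
#include <vector>

// Body of the 'loop' for a single index: no loop statement, just
// "what do I do for element i".
void add_one_element(std::size_t i, const std::vector<int>& a,
                     const std::vector<int>& b, std::vector<int>& out) {
    out[i] = a[i] + b[i];
}

// Hypothetical stand-in for the run-time: it owns the index and calls the body
// once per index. A GPU would execute these calls in parallel; this sequential
// version only mimics the structure.
template <typename Body>
void run_for_each_index(std::size_t count, Body body) {
    for (std::size_t i = 0; i < count; ++i) {
        body(i);
    }
}

// Assumes a and b have the same length.
std::vector<int> add_vectors(const std::vector<int>& a, const std::vector<int>& b) {
    std::vector<int> out(a.size());
    run_for_each_index(a.size(), [&](std::size_t i) { add_one_element(i, a, b, out); });
    return out;
}
```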
CUDA supports what looks like if branches. Only processors running code which match the if test can be active, so a subset of processors will be active for each 'branch' of the if test. As an example this if... else if ... else ..., has three branches. Each processor will execute only one branch, and be 're-synched' ready to move on with the rest of the processors when the if is complete. It may be that some of the branch conditions are not matched by any processor. So there is no need to execute that branch (for that example, three branches is the worst case). Then only one or two branches are executed sequentially, completing the whole if more quickly.
There is no 'magic'. The programmer must be aware that the code will be run on a CUDA device, and write code consciously for it.
CUDA does not take old C/C++ code and auto-magically run the computation across an array of processors. CUDA can compile and run ordinary C and much of C++ sequentially, but there is very little (nothing?) to be gained by that because it will run sequentially, and more slowly than a modern CPU. This means the code in some libraries is not (yet) a good match with CUDA capabilities. A CUDA program could operate on multi-kByte bit-vectors simultaneously. CUDA isn't able to auto-magically convert existing sequential C/C++ library code into something which would do that.
CUDA does provide a relatively straightforward way to write code using familiar C/C++ syntax; it adds a few extra concepts and generates code which will run across an array of processors. It has the potential to give much more than 10x speedup vs e.g. multi-core x86.
Edit - Plans: I do not work for NVIDIA
For the very best performance CUDA wants information at compile time.
So template mechanisms are the most useful because it gives the developer a way to say things at compile time, which the CUDA compiler could use. As a simple example, if a matrix is defined (instantiated) at compile time to be 2D and 4 x 8, then the CUDA compiler can work with that to organise the program across the processors. If that size is dynamic, and changes while the program is running, it is much harder for the compiler or run-time system to do a very efficient job.
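A rough sketch of the matrix example above, written as ordinary host-side C++ rather than device code. Matrix and sum are hypothetical names. The point is only that the dimensions are template parameters, so every loop bound is a constant the compiler can see.

```cpp
#include <array>
#include <cstddef>

// Dimensions are template parameters, so they are constants known to the compiler.
template <std::size_t Rows, std::size_t Cols>
struct Matrix {
    std::array<float, Rows * Cols> data{};

    float& at(std::size_t r, std::size_t c) { return data[r * Cols + c]; }
    float at(std::size_t r, std::size_t c) const { return data[r * Cols + c]; }

    static constexpr std::size_t size() { return Rows * Cols; }
};

// The loop bounds are compile-time constants for each instantiation, which gives
// the compiler the chance to unroll or partition the work up front.
template <std::size_t Rows, std::size_t Cols>
float sum(const Matrix<Rows, Cols>& m) {
    float total = 0.0f;
    for (std::size_t r = 0; r < Rows; ++r)
        for (std::size_t c = 0; c < Cols; ++c)
            total += m.at(r, c);
    return total;
}

// The 4 x 8 case from the text: the size can be checked without running anything.
static_assert(Matrix<4, 8>::size() == 32, "a 4 x 8 matrix has 32 elements");
```

With a dynamically sized matrix the same loops would need run-time bounds, and none of this could be checked or planned before the program runs.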
EDIT:
CUDA has class and function templates.
I apologise if people read this as saying CUDA does not. I agree I was not clear.
I believe the CUDA GPU-side implementation of templates is not complete w.r.t. C++.
User harrism has commented that my answer is misleading. harrism works for NVIDIA, so I will wait for advice. Hopefully this is already clearer.
The hardest stuff to do efficiently across multiple processors is dynamic branching down many alternate paths because that effectively serialises the code; in the worst case only one processor can execute at a time, which wastes the benefit of a GPU. So virtual functions seem to be very hard to do well.
There are some very smart whole-program-analysis tools which can deduce much more type information than the developer might understand. Existing tools might deduce enough to eliminate virtual functions, and hence move analysis of branching to compile time. There are also techniques for instrumenting program execution which feeds directly back into recompilation of programs which might reach better branching decisions.
AFAIK (modulo feedback) the CUDA compiler is not yet state-of-the-art in these areas.
(IMHO it is worth a few days for anyone interested, with a CUDA or OpenCL-capable system, to investigate them, and do some experiments. I also think, for people interested in these areas, it is well worth the effort to experiment with Haskell, and have a look at Data Parallel Haskell)

CUDA is a platform (architecture, programming model, assembly virtual machine, compilation tools, etc.), not just a single programming language. CUDA C is just one of a number of language systems built on this platform (CUDA C, C++, CUDA Fortran, PyCUDA, are others.)
CUDA C++
Currently CUDA C++ supports the subset of C++ described in Appendix D ("C/C++ Language Support") of the CUDA C Programming Guide.
To name a few:
Classes
__device__ member functions (including constructors and destructors)
Inheritance / derived classes
virtual functions
class and function templates
operators and overloading
functor classes
Edit: As of CUDA 7.0, CUDA C++ includes support for most language features of the C++11 standard in __device__ code (code that runs on the GPU), including auto, lambda expressions, range-based for loops, initializer lists, static assert, and more.
Examples and specific limitations are also detailed in the same appendix linked above. As a very mature example of C++ usage with CUDA, I recommend checking out Thrust.
Future Plans
(Disclosure: I work for NVIDIA.)
I can't be explicit about future releases and timing, but I can illustrate the trend that almost every release of CUDA has added additional language features to get CUDA C++ support to its current (In my opinion very useful) state. We plan to continue this trend in improving support for C++, but naturally we prioritize features that are useful and performant on a massively parallel computational architecture (GPU).

Not realized by many, CUDA is actually two new programming languages, both derived from C++. One is for writing code that runs on GPUs and is a subset of C++. Its function is similar to HLSL (DirectX) or Cg (OpenGL) but with more features and compatibility with C++. Various GPGPU/SIMT/performance-related concerns apply to it that I need not mention. The other is the so-called "Runtime API," which is hardly an "API" in the traditional sense. The Runtime API is used to write code that runs on the host CPU. It is a superset of C++ and makes it much easier to link to and launch GPU code. It requires the NVCC pre-compiler which then calls the platform's C++ compiler. By contrast, the Driver API (and OpenCL) is a pure, standard C library, and is much more verbose to use (while offering few additional features).
Creating a new host-side programming language was a bold move on NVIDIA's part. It makes getting started with CUDA easier and writing code more elegant. However, truly brilliant was not marketing it as a new language.

Sometimes you hear that CUDA would be C and C++, but I don't think it is, for the simple reason that this is impossible. To cite from their programming guide:
For the host code, nvcc supports whatever part of the C++ ISO/IEC
14882:2003 specification the host c++ compiler supports.
For the device code, nvcc supports the features illustrated in Section
D.1 with some restrictions described in Section D.2; it does not
support run time type information (RTTI), exception handling, and the
C++ Standard Library.
As far as I can see, it only refers to C++, and only supports C where this happens to be in the intersection of C and C++. So better think of it as C++ with extensions for the device part rather than C. That saves you a lot of headaches if you are used to C.

What is NVIDIA's plan?
I believe the general trend is that CUDA and OpenCL are regarded as too low level techniques for many applications. Right now, Nvidia is investing heavily into OpenACC which could roughly be described as OpenMP for GPUs. It follows a declarative approach and tackles the problem of GPU parallelization at a much higher level. So that is my totally subjective impression of what Nvidia's plan is.

Why is llvm considered unsuitable for implementing a JIT?

Many dynamic languages implement (or want to implement) a JIT Compiler in order to speed up their execution times. Inevitably, someone from the peanut gallery asks why they don't use LLVM. The answer is often, "LLVM is unsuitable for building a JIT." (For Example, Armin Rigo's comment here.)
Why is LLVM Unsuitable for building a JIT?
Note: I know LLVM has its own JIT. If LLVM used to be unsuitable, but now is suitable, please say what changed. I'm not talking about running LLVM Bytecode on the LLVM JIT, I'm talking about using the LLVM libraries to implement a JIT for a dynamic language.

Why is LLVM Unsuitable for building a JIT?
I wrote HLVM, a high-level virtual machine with a rich static type system including value types, tail call elimination, generic printing, C FFI and POSIX threads with support for both static and JIT compilation. In particular, HLVM offers incredible performance for a high-level VM. I even implemented an ML-like interactive front-end with variant types and pattern matching using the JIT compiler, as seen in this computer algebra demonstration. All of my HLVM-related work combined totals just a few weeks work (and I am not a computer scientist, just a dabbler).
I think the results speak for themselves and demonstrate unequivocally that LLVM is perfectly suitable for JIT compilation.

There are some notes about LLVM in the Unladen Swallow post-mortem blog post:
http://qinsb.blogspot.com/2011/03/unladen-swallow-retrospective.html .
Unfortunately, LLVM in its current state is really designed as a static compiler optimizer and back end. LLVM code generation and optimization is good but expensive. The optimizations are all designed to work on IR generated by static C-like languages. Most of the important optimizations for optimizing Python require high-level knowledge of how the program executed on previous iterations, and LLVM didn't help us do that.

There is a presentation on using LLVM as a JIT backend where they address many of the concerns raised as to why it's bad; most of it seems to boil down to people building a static compiler as a JIT instead of building an actual JIT.

It takes a long time to start up is the biggest complaint - however, this is not so much of an issue if you did what Java does and start up in interpreter mode, and use LLVM to compile the most used parts of the program.
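The "start in interpreter mode, compile only the hot parts" idea can be sketched roughly like this. TieredFunction, the call-count threshold and the compile callback are all hypothetical; a real VM would have an actual interpreter on one side and a call into LLVM on the other.

```cpp
#include <cstdint>
#include <functional>
#include <utility>

// Hypothetical tiered function: starts with a slow "interpreted" implementation
// and switches to a "compiled" one after it has been called often enough.
class TieredFunction {
public:
    using Impl = std::function<std::int64_t(std::int64_t)>;

    TieredFunction(Impl interpreted, std::function<Impl()> compile, unsigned hot_threshold)
        : current_(std::move(interpreted)),
          compile_(std::move(compile)),
          hot_threshold_(hot_threshold) {}

    std::int64_t call(std::int64_t arg) {
        // Pay the expensive compilation cost only once the function proves to be hot.
        if (!compiled_ && ++calls_ >= hot_threshold_) {
            current_ = compile_();  // stand-in for invoking the optimising back end
            compiled_ = true;
        }
        return current_(arg);
    }

    bool is_compiled() const { return compiled_; }

private:
    Impl current_;
    std::function<Impl()> compile_;
    unsigned hot_threshold_;
    unsigned calls_ = 0;
    bool compiled_ = false;
};
```

Functions that are only called a handful of times never reach the threshold, so they never pay the compilation cost at all.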
Also while there are arguments like this scattered all over the internet, Mono has been using LLVM as a JIT compiler successfully for a while now (though it's worth noting that it defaults to their own faster but less efficient backend, and they also modified parts of LLVM).
For dynamic languages, LLVM might not be the right tool, just because it was designed for optimizing system programming languages like C and C++ which are strongly/statically typed and support very low level features. In general the optimizations performed on C don't really make dynamic languages fast, because you're just creating an efficient way of running a slow system. Modern dynamic language JITs do things like inlining functions that are only known at runtime, or optimizing based on what type a variable has most of the time, which LLVM is not designed for.
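To illustrate what "optimizing based on what type a variable has most of the time" means, here is a hand-written sketch of a guarded fast path. Value, generic_add and AddSite are invented for this example. A real JIT would generate the specialised machine code at run time instead of having it written out in advance.

```cpp
// A dynamically typed value, roughly as a dynamic-language runtime might store it.
struct Value {
    enum class Tag { Int, Float };
    Tag tag;
    long i;    // valid when tag == Tag::Int
    double f;  // valid when tag == Tag::Float
};

Value make_int(long v) { return Value{Value::Tag::Int, v, 0.0}; }
Value make_float(double v) { return Value{Value::Tag::Float, 0, v}; }

// Slow generic path: handles every combination of operand types.
Value generic_add(const Value& a, const Value& b) {
    if (a.tag == Value::Tag::Int && b.tag == Value::Tag::Int) return make_int(a.i + b.i);
    double x = (a.tag == Value::Tag::Int) ? static_cast<double>(a.i) : a.f;
    double y = (b.tag == Value::Tag::Int) ? static_cast<double>(b.i) : b.f;
    return make_float(x + y);
}

// Hypothetical call site specialised for the types seen most of the time (Int + Int).
// A cheap guard checks the assumption; if it fails, fall back to the generic path.
struct AddSite {
    unsigned fast_hits = 0;
    unsigned slow_hits = 0;

    Value add(const Value& a, const Value& b) {
        if (a.tag == Value::Tag::Int && b.tag == Value::Tag::Int) {
            ++fast_hits;
            return make_int(a.i + b.i);  // specialised code, no further dispatch
        }
        ++slow_hits;
        return generic_add(a, b);
    }
};
```

The information that Int + Int is the common case only exists at run time, which is exactly the kind of knowledge a static C-style optimiser never gets to see.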

Update: as of 7/2014, LLVM has added a feature called "Patch Points", which are used to support Polymorphic Inline Caches in Safari's FTL JavaScript JIT. This covers exactly the use case complained about in Armin Rigo's comment in the original question.

For a more detailed rant about the LLVM IR see here: LLVM IR is a compiler IR.

Developing embedded software library, C or C++?

I'm in the process of developing a software library to be used for embedded systems like an ARM chip or a TI DSP (for mostly embedded systems, but it would also be nice if it could also be used in a PC environment). Obviously this is a pretty broad range of target systems, so being able to easily port to different systems is a priority. The library will be used for interfacing with specific hardware and running some algorithms.
I am thinking C++ is the best option, over C, because it is much easier to maintain and read. I think the additional overhead is worth it for being able to work in the object oriented paradigm. If I was writing for a very specific system, I would work in C but this is not the case.
I'm assuming that these days most compilers for popular embedded systems can handle C++. Is this correct?
Are there any other factors I should consider? Is my line of thinking correct?

If portability is very important for you, especially on an embedded system, then C is certainly a better option than C++. While C++ compilers on embedded platforms are catching up, there's simply no match for the widespread use of C, for which any self-respecting platform has a compliant compiler.
Moreover, I don't think C is inferior to C++ where it comes to interfacing hardware. The amount of abstraction is sufficiently low (i.e. no deep class hierarchies) to make C just as good an option.

There is certainly good support of C++ for ARM. ARM have their own compiler and g++ can also generate EABI compliant ARM code. When it comes to the DSPs, you will have to look at their toolchain to decide what you are going to do. Be aware that the library that comes with a DSP may well not implement the full C or C++ standard library.
C++ is suitable for low-level embedded development and is used in the SymbianOS Kernel. Having said that, you should keep things as simple as possible.
Avoid exceptions which may demand more library support than what is present (therefore use new (std::nothrow) Foo instead of new Foo).
Avoid memory allocations as much as possible and do them as early as possible.
Avoid complex patterns.
Be aware that templates can bloat your code.
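A small sketch of the first two points in the list above: allocate once, early, with new (std::nothrow), and report failure through return values instead of exceptions. SampleBuffer and Sample are hypothetical names used only for illustration.

```cpp
#include <cstddef>
#include <new>

struct Sample {
    int channel;
    int value;
};

// Hypothetical fixed-capacity buffer: one allocation at start-up, none afterwards.
class SampleBuffer {
public:
    // new (std::nothrow) returns nullptr on failure instead of throwing.
    explicit SampleBuffer(std::size_t capacity)
        : data_(new (std::nothrow) Sample[capacity]),
          capacity_(data_ ? capacity : 0) {}

    ~SampleBuffer() { delete[] data_; }

    SampleBuffer(const SampleBuffer&) = delete;
    SampleBuffer& operator=(const SampleBuffer&) = delete;

    // Check this once after construction instead of catching std::bad_alloc.
    bool ok() const { return data_ != nullptr; }

    // Returns false instead of throwing when the buffer is full or unallocated.
    bool push(Sample s) {
        if (count_ >= capacity_) return false;
        data_[count_++] = s;
        return true;
    }

    std::size_t size() const { return count_; }

private:
    Sample* data_;
    std::size_t capacity_;
    std::size_t count_ = 0;
};
```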

I have seen many complaints that C++ is "bloated" and inappropriate for embedded systems.
However, in an interview with Stroustrup and Sutter, Bjarne Stroustrup mentioned that he'd seen heavily templated C++ code going into (IIRC) the braking systems of BMWs, as well as in missile guidance systems for fighter aircraft.
What I take away from this is that experts of the language can generate sophisticated, efficient code in C++ that is most certainly suitable for embedded systems. However, a "C With Classes"[1] programmer that does not know the language inside out will generate bloated code that is inappropriate.
The question boils down to, as always: in which language can your team deliver the best product?
[1] I know that sounds somewhat derogatory, but let me say that I know an awful lot of these guys, and they churn out an awful lot of relatively simple code that gets the job done.

C++ compilers for embedded platforms are much closer to 83's C with classes than 98's C++ standard, let alone C++0x. For instance, some platforms we use still compile with a special version of gcc made from gcc-2.95!
This means that your library interface will not be able to provide interfaces with containers/iterators, streams, or such advanced C++ features. You'll have to stick with simple C++ classes, that can very easily be expressed as a C interface with a pointer to a structure as first parameter.
This also means that within your library, you won't be able to use templates to their full power. If you want portability, you will still be restricted to generic containers use of templates, which is, I'm sure you'll admit, only a very tiny part of C++ templates power.

C++ has little or no overhead compared to C if used properly in an embedded environment. C++ has many advantages for information hiding, OO, etc. If your embedded processor is supported by gcc in C then chances are it will also be supported with C++.

On the PC, C++ isn't a problem at all -- high quality compilers are extremely widespread and almost every C compiler is directly associated with a C++ compiler that's quite good, though there are a few exceptions such as lcc and the newly revived pcc.
Larger embedded systems like those based on the ARM are generally quite similar to desktop systems in terms of tool chain availability. In fact, many of the same tools available for desktop machines can also generate code to run on ARM-based machines (e.g., lots of them use ports of gcc/g++). There's less variety for TI DSPs (and a greater emphasis on quality of generated code than source code features), but there are still at least a couple of respectable C++ compilers available.
If you want to work with smaller embedded systems, the situation changes in a hurry. If you want to be able to target something like a PIC or an AVR, C++ isn't really much of an option. In theory, you could get (for example) Comeau to produce a custom port that generated code you could compile on that target's C compiler -- but chances are pretty good that even if you did, it wouldn't work out very well. These systems are really just too limited (especially on memory size) for C++ to fit them well.

Depending on what your intended use is for the library, I think I'd suggest implementing it first as C - but the design should keep in mind how it would be incorporated into a C++ design. Then implement C++ classes on top of and/or along side of the C implementation (there's no reason this step cannot be done concurrently with the first). If your C design is done with a C++ design in mind, it's likely to be as clean, readable and maintainable as the C++ design would be. This is somewhat more work, but I think you'll end up with a library that's useful in more situations.
While you'll find C++ used more and more on various embedded projects, there are still many that restrict themselves to C (and I'd guess this is more often the case than not) - regardless of whether or not the tools support C++. It would be a shame to have a nice library of routines that you could bring to a new project you're working on, but be unable to use them because C++ isn't being used on that particular project.
In general, it's much easier to use a well-designed C library from C++ than the other way around. I've taken this approach with several sets of code including parsing Intel Hex files, a simple command parser, manipulating synchronization objects, FSM frameworks, etc. I'm planning on doing a simple XML parser at some point.
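As a sketch of the "C first, C++ on top" layout described above: a C-style core built from a plain struct and free functions that take a pointer to it as the first parameter, plus a thin C++ class that wraps it. The checksum example and all its names are made up. In a real library the C part would live in its own .c file, with a header that C and C++ projects can both include.

```cpp
#include <cstddef>

// ---- C-style core: plain struct plus functions taking the context pointer first. ----
extern "C" {

struct checksum_ctx {
    unsigned long sum;
    std::size_t count;
};

void checksum_init(checksum_ctx* ctx) {
    ctx->sum = 0;
    ctx->count = 0;
}

void checksum_update(checksum_ctx* ctx, const unsigned char* data, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        ctx->sum += data[i];
    }
    ctx->count += len;
}

unsigned long checksum_result(const checksum_ctx* ctx) {
    return ctx->sum;
}

}  // extern "C"

// ---- Thin C++ layer on top: same behaviour, friendlier interface. ----
class Checksum {
public:
    Checksum() { checksum_init(&ctx_); }
    void update(const unsigned char* data, std::size_t len) {
        checksum_update(&ctx_, data, len);
    }
    unsigned long result() const { return checksum_result(&ctx_); }

private:
    checksum_ctx ctx_;
};
```

A C-only project uses the three functions directly; a C++ project can use the class. Both end up running the same code.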

Here's an entirely different C++-vs-C argument: stable ABIs. If your library exports a C ABI, it can be compiled with any compiler that works on the system, because C ABIs are generally platform standards. If your library exports a C++ ABI, it can only be compiled with a matching compiler -- because C++ ABIs are usually not platform standards, and often differ from compiler to compiler and even version to version.
Interestingly, one of the rare exceptions to this is ARM; there's an ARM C++ ABI specification, and all compliant ARM compilers follow it. This is not true on x86; on x86, you're lucky if a C++ library compiled with a 4.1 version of GCC will link correctly with an application compiled with GCC 4.4, and don't even ask about 3.4.6.
Even if you export a C ABI, you can have problems. If your library uses C++ internally, it will then link to libstdc++ for things in the C++ std:: namespace. If your user compiles a C++ application that uses your library, they'll also link to libstdc++ -- and so the overall application gets linked to libstdc++ twice, and their libstdc++ may not be compatible with your libstdc++, which can (or so I understand) lead to odd errors from the intersection of the two. Considerably less likely, but still possible.
All of these arguments only apply because you're writing a library, and they're not showstoppers. But they are things to be aware of.

Using C++ in an embedded environment

Today I got into a very interesting conversation with a coworker, and one subject got me thinking and googling this evening: using C++ (as opposed to C) in an embedded environment. Looking around, there seem to be some good trade-offs for and against the features C++ provides, but others like Meyers clearly support it. So, I was wondering who would be able to shed some light on this topic and what the general consensus of the community was.

C++ for embedded platforms is perfectly fine - as long as you treat it as a better C. I love the fact that the language is slightly more structured. You can still do all the things that you want to do with C. Just remember to stick to an embedded C library like Newlib or uClibc.
I particularly like the abstraction that we can build using C++, particularly for I/O devices. So, we can have a class for UART and a class for GPIO and what nots. It is cleaner than having a bunch of functions (IMHO).
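For what it's worth, here is roughly what I mean by a class for GPIO. GpioPort is a made-up name and the register layout is an assumption. On real hardware the reference would point at a memory-mapped register whose address comes from the chip's datasheet; here any 32-bit variable will do, which also makes the class easy to test on a PC.

```cpp
#include <cstdint>

// Hypothetical GPIO port wrapper around one 32-bit register (pins 0..31).
class GpioPort {
public:
    explicit GpioPort(volatile std::uint32_t& reg) : reg_(reg) {}

    void set(unsigned pin)          { reg_ = reg_ | (1u << pin); }
    void clear(unsigned pin)        { reg_ = reg_ & ~(1u << pin); }
    bool is_set(unsigned pin) const { return (reg_ & (1u << pin)) != 0; }

private:
    volatile std::uint32_t& reg_;  // volatile: every access must really hit the register
};
```

The member functions are small enough to be inlined, so this should compile down to the same bit operations you would write by hand in C.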

The fear of C++ among embedded developers is largely a thing of the past, when C++ compilers were not as good as C compilers (optimizations and code quality wise).
This applies especially to modern platforms with 32 bit architectures.
But, C is certainly still the preferred choice for more confined environments (as is assembler for 8 bit or 4 bit targets).
So, it really boils down to the resources your target platform provides, and how much of these resources you are likely to actually require, i.e. if you can afford the 'luxury' of doing embedded development in C++ (or even Java for that matter), because you know that you'll hardly have any issues regarding memory or CPU constraints.
Nowadays, many modern embedded platforms (think gaming consoles, mobile phones, PDAs etc), have really become very capable targets, with RISC architectures, several MB of RAM, and 3D hardware acceleration.
It would be a poor decision to program such platforms using just C or even assembler out of uninformed performance considerations; on the other hand, programming a 16 bit PIC in C++ would probably also be a controversial decision.
So, it's really a matter of asking yourself how much of the power you'll actually need and how much you can afford to sacrifice in order to improve the development experience (high level language, faster development, less tedious/redundant tasks).

It sort of depends on the particular nature of your embedded system and which features of C++ you use. The language itself doesn't necessarily generate bulkier code than C.
For example, if memory is your tightest constraint, you can just use C++ like "C with classes" -- that is, only using direct member functions, disabling RTTI, and not having any virtual functions or templates. That will fit in pretty much the same space as the equivalent C code, since you've no type information, vtables, or redundant functions to clutter things up.
I've found that templates are the biggest thing to avoid when memory is really tight, since you get one copy of each template function for each type it's specialized on, and that can rapidly bloat code segment.
In the console video games industry (which is sort of the beefy end of the embedded world) C++ is king. Our constraints are hard limits on memory (512mb on current generation) and realtime performance. Generally virtual functions and templates are used, but not exceptions, since they bloat the stack and are too perf-costly. In fact, one major manufacturer's compiler doesn't even support exceptions at all.

In my previous company all embedded code was written in a small subset of C due to safety (SIL-2) and memory reasons. Introducing a richer language like C++ in that particular scenario would probably have caused more trouble than benefit.
With all due respect to C++ (which is a language I really love), I think C - in our particular scenario - was the better choice.
I bet in some cases C++ is just fine to use for embedded applications but it really depends on the application - there is a difference if your program is controlling a nuclear plant or administrating an address book on your cell phone.

I don't know about "general consensus", only the company I work for (which does a lot of development for mobile phones, car navigation systems, DPFs, etc.).
The main drawback I've encountered to using C++ on embedded platforms as opposed to C is that it isn't quite as portable - there are many more cases of compilers that don't adhere to the standard which can cause problems if you need to build your code with more than 1 compiler or outright have bugs in the implementation. Then there are environments where C++ code simply won't run - BREW's issues with relocatable code and its "native OOP" don't play so well with "regular" C++ classes and inheritance.
In the end, though, if you're only targeting 1 platform, I'd say use whatever you think is "better" (faster, less bugs, better design) for your development - in most cases the issues can be worked around quite easily.

Depends what kind of embedded development you are doing. I've done embedded development with C++, C, and assembly on various platforms; you can even use Java to write applications on smart phones.
For instance on a smart phone like device that's running Windows CE 5, almost all of the code is C++, including in the operating system. Only small bits are written in C or assembly.
On the other hand I've written code for an MSP430 microcontroller, which was in C, and I probably would have done that in C++ had the compiler been more reliable and standards compliant.
Also I seem to recall a university lecturer of mine talking about writing embedded code in Forth or something. So really any language can do.

Nowadays it will all boil down to the C++ runtime support of the platform. You're likely to find a way to compile C++ code down to almost any embedded platform with GCC, but if you can't find a suitable C++ runtime for the platform your efforts will be futile, unless you write your own C++ runtime.

One of the few things on which I tend to agree with Linus is his opinion about C++: http://thread.gmane.org/gmane.comp.version-control.git/57643/focus=57918
Besides this, if you really really want to use C++ you might want to have a look at http://www.caravan.net/ec2plus/ which describes Embedded C++, or, better to say, what you should not use in C++ for embedded systems.

The big thing keeping us from using C++ for a long time was the VxWorks support for it, which truly sucked. That supposedly has gotten better on VxWorks 6 (yes, it's been out a while... good ol' vendor lock-in and lack of company vision has kept us stuck on VxWorks 5.5).
So for us it's mostly a question of the environment. After that, C++ can obviously be just as good as C... it's a matter of people understanding what their tool does and how to use it. C++ may make it easier to write incredibly inefficient code, but that doesn't mean we have to succumb to it.

I am currently fighting a problem with exceptions in an embedded Linux application. We are trying to port software written for a different platform that seemed to support exceptions well, but the new compiler tools (a port of gcc) report errors when creating the eh_frame. I was against using exceptions for this tool, but the developer reassured me that modern compilers would support it well.
My opinion is that there are some advantages to C++, but I would stay away from exceptions and the standard template library. We haven't had problems using virtual functions.

C++ is suitable for microcontrollers and devices without an OS. You just have to know the architecture of the system and be conscious of time and space constraints, especially when doing mission critical programming.
With C++ you can do abstraction which often leads to an increased footprint in the code. You do not want this when programming for a resource-limited machine such as an 8-bit MCU.
Generally, avoid:
Dynamic memory allocation because it represents uncertainty in timing
Overloading
RTTI because the memory cost is large
Exceptions because of the execution speed lowering
Be cautious with virtual functions as they have a resource cost of a vtable per class and one pointer to the vtable per object. Also, use const in place of #define.
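A quick sketch of the last two points, with made-up names (PlainSensor, VirtualSensor, kBufferSize). The exact object sizes are implementation-defined, so the code only checks that the virtual version is bigger, which I'd expect on the usual ABIs.

```cpp
#include <cstddef>

// Typed, scoped constant instead of "#define BUFFER_SIZE 64".
const std::size_t kBufferSize = 64;

struct PlainSensor {
    int raw;
    int read() const { return raw; }          // direct member function, no vtable
};

struct VirtualSensor {
    int raw;
    virtual int read() const { return raw; }  // adds a vtable for the class...
    virtual ~VirtualSensor() = default;       // ...and a hidden vtable pointer per object
};

// Only the inequality is checked, because the actual sizes vary by platform.
static_assert(sizeof(VirtualSensor) > sizeof(PlainSensor),
              "virtual functions add per-object overhead on this ABI");

unsigned char g_buffer[kBufferSize];          // the constant works as an array bound
```

On a part with a few KB of RAM and many small objects, that extra pointer per object can add up quickly.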
As you move up to 16 and 32-bit MCUs, with 10s or 100s of MB RAM, heavier features like the ones mentioned above may be used.
So to round up, C++ is useful for embedded systems. A main benefit is that OOP can be useful when you want to abstract aspects of the microcontroller, for example UART or state machines. But you may want to avoid certain features all of the time and some of the features some of the time, depending on the target you are programming for.
