Where to find information on Embedded C++? - c++

I want to find information on "C++ programming on an embedded platform".
I googled it but was unable to find sufficient information on the topic. What I want to find is how exactly C++ is useful in an embedded environment, with detailed descriptions and examples (if available).
Can anyone please suggest links, or free ebook downloads if any are available?

I can also recommend the books Embedded C by Michael J. Pont and Programming Embedded Systems by Michael Barr.
During my 14 years as an embedded developer I learned that "established practice" often doesn't work in embedded systems. When a book says to use this or that pattern, that holds only if you have unlimited memory and CPU power.
I had to break almost every design rule when I designed the new FW platform for a major company last year. You always need to ask yourself whether a well-known and accepted solution is really the best for your project, given its cost in code size or speed. Things to keep in mind:
Local or global?
Think twice before you declare a variable. Local variables are created on the stack at runtime, while global variables are allocated once when the system boots.
const data is stored in flash, taking up space, and has the same access time as an indexed array. Better to use type-cast defines if you don't need to refer to the values through pointers:
#define kState_Idle (unsigned char)4
This compiles the 4 into the assembly code as an immediate value instead of fetching it from flash as a read-only variable.
Do not use double or float; they are dead slow. Use integer math instead, and avoid the math library at all costs :)
Local variables accessed within loops (for, while, etc.) slow things down; declare them as register variables to gain speed.
Use sections to place code and data
The C/C++ startup code copies all variables (including constants) to RAM. That is a big waste of space for read-only data; strings such as "Hello world" fall into this category too. Use your toolchain's section placement to keep read-only data in flash.
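A minimal sketch of section placement, assuming a GCC-style toolchain; the section name .flashdata is an assumption, and it only works if your linker script maps that section to flash and the startup code does not copy it to RAM:

#include <cstdint>

// Assumption: the linker script keeps ".flashdata" in flash and the
// startup code does not copy it to RAM. Names are illustrative.
__attribute__((section(".flashdata")))
static const char kGreeting[] = "Hello world";

__attribute__((section(".flashdata")))
static const std::uint8_t kCrcTable[256] = { /* values omitted */ };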
When it comes to C++, templates are a big no-no, and so are RTTI and exceptions. Avoid them!
Overloading and simple polymorphism will get you really far with good planning; your code will be compact and fast.
Libraries
Depending on the microcontroller's size, you will probably want to avoid including any STL. We made our own versions of get(), put(), printf() etc. to keep code size down.
Use hardware
Don't forget to study your microcontroller/CPU so you can utilize the hardware to 100%. For example, instead of filling or copying memory with memset or memcpy, use DMA if you have it.
Study the assembly as well. Controllers very often have specialized instructions that would take several lines of C/C++ to reproduce. You can write your own functions in assembly and connect them into your C/C++ code. Very good examples are bit set/clear instructions or block-manipulation instructions.
Check what data size the controller uses natively. For example, if it is a 16-bit system it likely reads 16 bits every time, even if you declared a char. In this case reading a char takes longer than reading a short, because an extra masking step is needed.
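A minimal sketch of a portable workaround, using the "fast" integer typedefs from <cstdint>, which let the compiler pick the natively sized representation on each target:

#include <cstdint>

// uint_fast8_t means "at least 8 bits, whatever width is fastest here":
// on a 16-bit part it may be 16 bits wide, avoiding the extra masking.
void fill_pattern(volatile std::uint16_t& port) {
    for (std::uint_fast8_t i = 0; i < 100; ++i) {
        port = i;   // the loop counter stays in a natively sized register
    }
}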
Memory
Remember that internal RAM is much faster than external RAM. You can place variables, or even code, in internal RAM to speed things up.
Flash is generally slower than RAM, especially for writes. However, placing frequently accessed read-only variables there is usually no drawback, since the compiler usually detects a frequently used value and assigns it to an internal register.
Testing
It is often not possible to send debug information up to the host system fast enough without impacting performance. In these cases, create an internal debug buffer to store your information and analyse it afterwards.
Measure your execution time by toggling a hardware pin; it takes one assembly instruction and has virtually no impact on the execution speed. Monitor the pin with a logic analyser or oscilloscope. We hunted nanoseconds in frequently used functions to boost overall performance.
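A minimal sketch of the pin-toggle technique; the register address and pin bit are hypothetical and must be taken from your part's datasheet:

#include <cstdint>

// Hypothetical memory-mapped GPIO output register (check your datasheet).
static volatile std::uint8_t* const kPortA =
    reinterpret_cast<volatile std::uint8_t*>(0x40010000);
static const std::uint8_t kDebugPin = 1u << 3;

void hot_function() {
    *kPortA |= kDebugPin;                              // pin high: entry visible on the scope
    // ... the work being measured ...
    *kPortA &= static_cast<std::uint8_t>(~kDebugPin);  // pin low: exit
}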
Auto-generated code documentation is also a good way to find "strange" designs or solutions. We used Doxygen with Graphviz to generate class diagrams and relations. This gave us a good overview, and we could easily spot "obsolete" classes or out-of-date subsystems (we used an Agile development method).
Huh.. I can go on forever and write a whole book about this :)
We made a printer printing 150 mm/s with 20 KB of RAM (RTOS, variables, communication buffers and heap) and 64 KB of flash (boot block, application code and two flash disks), all internal, using the above recommendations in C++.
Good luck!

I would recommend reading books related to embedded C, for example Embedded C by Michael J. Pont, 2002, or Programming Embedded Systems in C and C++ by Michael Barr, 1999 (http://book.opensourceproject.org.cn/embedded/embeddedc/).
In a nutshell, almost all embedded systems start out in C/assembler. C++ can be used as well, but its usage is not far from "non-embedded C++" (the only difference usually is that heavyweight features like RTTI, exceptions, etc. are avoided).
BTW, it might be more useful to look at books/guides/examples for your particular embedded platform/OS.

Not really a valid question.
Embedded can mean anything from a micro-ATX board running Windows 7 with a full Windows C++ app, to a single-chip uC whose compiler might barely support C with "//" comments.
Occasionally embedded platforms lack exceptions and perhaps RTTI; otherwise it's pretty much standard C++.

Related

How to make a GameBoy / GameBoy Advance Emulator? [duplicate]

How do emulators work? When I see NES/SNES or C64 emulators, it astounds me.
Do you have to emulate the processor of those machines by interpreting its particular assembly instructions? What else goes into it? How are they typically designed?
Can you give any advice for someone interested in writing an emulator (particularly a game system)?
Emulation is a multi-faceted area. Here are the basic ideas and functional components. I'm going to break it into pieces and then fill in the details via edits. Many of the things I'm going to describe will require knowledge of the inner workings of processors -- assembly knowledge is necessary. If I'm a bit too vague on certain things, please ask questions so I can continue to improve this answer.
Basic idea:
Emulation works by handling the behavior of the processor and the individual components. You build each individual piece of the system and then connect the pieces much like wires do in hardware.
Processor emulation:
There are three ways of handling processor emulation:
Interpretation
Dynamic recompilation
Static recompilation
With all of these paths, you have the same overall goal: execute a piece of code to modify processor state and interact with 'hardware'. Processor state is a conglomeration of the processor registers, interrupt handlers, etc for a given processor target. For the 6502, you'd have a number of 8-bit integers representing registers: A, X, Y, P, and S; you'd also have a 16-bit PC register.
With interpretation, you start at the IP (instruction pointer -- also called PC, program counter) and read the instruction from memory. Your code parses this instruction and uses this information to alter processor state as specified by your processor. The core problem with interpretation is that it's very slow; each time you handle a given instruction, you have to decode it and perform the requisite operation.
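A minimal sketch of such an interpreter loop for a 6502-like CPU; the state layout follows the registers listed above, and the two opcode cases shown are an illustrative subset (status-flag updates omitted):

#include <cstdint>

struct Cpu {                            // 6502-ish processor state
    std::uint8_t a = 0, x = 0, y = 0, p = 0, s = 0xFD;
    std::uint16_t pc = 0;
    std::uint8_t mem[65536] = {};       // flat memory, for illustration only
};

void step(Cpu& cpu) {                   // fetch, decode, execute one instruction
    std::uint8_t opcode = cpu.mem[cpu.pc++];
    switch (opcode) {
    case 0xA9:                          // LDA #imm: load accumulator immediate
        cpu.a = cpu.mem[cpu.pc++];
        break;
    case 0xE8:                          // INX: increment the X register
        ++cpu.x;
        break;
    // ... one case per opcode, plus status-flag updates ...
    }
}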
With dynamic recompilation, you iterate over the code much like interpretation, but instead of just executing opcodes, you build up a list of operations. Once you reach a branch instruction, you compile this list of operations to machine code for your host platform, then you cache this compiled code and execute it. Then when you hit a given instruction group again, you only have to execute the code from the cache. (BTW, most people don't actually make a list of instructions but compile them to machine code on the fly -- this makes it more difficult to optimize, but that's out of the scope of this answer, unless enough people are interested)
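A conceptual sketch of the caching side of dynamic recompilation; the compiled block is stubbed as a bare function pointer, since real code generation is host-specific, and compile_block is a hypothetical helper:

#include <cstdint>
#include <unordered_map>

using CompiledBlock = void (*)();                 // host-native code for one guest block

CompiledBlock compile_block(std::uint16_t pc);    // hypothetical translator, defined elsewhere

std::unordered_map<std::uint16_t, CompiledBlock> block_cache;

void run_from(std::uint16_t guest_pc) {
    auto it = block_cache.find(guest_pc);
    if (it == block_cache.end()) {
        // Translate guest instructions from guest_pc up to the next branch
        // into host machine code, then cache the result.
        it = block_cache.emplace(guest_pc, compile_block(guest_pc)).first;
    }
    it->second();                                 // cache hits skip translation entirely
}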
With static recompilation, you do the same as in dynamic recompilation, but you follow branches. You end up building a chunk of code that represents all of the code in the program, which can then be executed with no further interference. This would be a great mechanism if it weren't for the following problems:
Code that isn't in the program to begin with (e.g. compressed, encrypted, generated/modified at runtime, etc) won't be recompiled, so it won't run
It's been proven that finding all the code in a given binary is equivalent to the Halting problem
These combine to make static recompilation completely infeasible in 99% of cases. For more information, Michael Steil has done some great research into static recompilation -- the best I've seen.
The other side to processor emulation is the way in which you interact with hardware. This really has two sides:
Processor timing
Interrupt handling
Processor timing:
Certain platforms -- especially older consoles like the NES, SNES, etc -- require your emulator to have strict timing to be completely compatible. With the NES, you have the PPU (pixel processing unit) which requires that the CPU put pixels into its memory at precise moments. If you use interpretation, you can easily count cycles and emulate proper timing; with dynamic/static recompilation, things are a /lot/ more complex.
Interrupt handling:
Interrupts are the primary mechanism by which the CPU communicates with hardware. Generally, your hardware components will tell the CPU which interrupts they care about. This is pretty straightforward: when your code raises a given interrupt, you look at the interrupt handler table and call the proper callback.
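A minimal sketch of such a dispatch table; the vector count and the handler signature are assumptions:

#include <array>
#include <cstdint>

using IrqHandler = void (*)();                 // assumed handler signature
std::array<IrqHandler, 256> irq_table{};       // one slot per interrupt number

void register_irq(std::uint8_t irq, IrqHandler handler) {
    irq_table[irq] = handler;                  // a hardware component subscribes here
}

void raise_irq(std::uint8_t irq) {             // called by emulated hardware
    if (irq_table[irq]) irq_table[irq]();      // dispatch to the registered callback
}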
Hardware emulation:
There are two sides to emulating a given hardware device:
Emulating the functionality of the device
Emulating the actual device interfaces
Take the case of a hard-drive. The functionality is emulated by creating the backing storage, read/write/format routines, etc. This part is generally very straightforward.
The actual interface of the device is a bit more complex. This is generally some combination of memory mapped registers (e.g. parts of memory that the device watches for changes to do signaling) and interrupts. For a hard-drive, you may have a memory mapped area where you place read commands, writes, etc, then read this data back.
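A minimal sketch of that memory-mapped dispatch: bus reads and writes in a small address window are routed to the device instead of plain RAM (the addresses and register layout are invented for illustration):

#include <cstdint>

struct Disk {
    std::uint8_t command = 0, status = 0x01;   // invented register layout
};

std::uint8_t bus_read(Disk& disk, std::uint8_t* ram, std::uint16_t addr) {
    if (addr == 0xD001) return disk.status;    // memory-mapped status register
    return ram[addr];                          // everything else is plain RAM
}

void bus_write(Disk& disk, std::uint8_t* ram, std::uint16_t addr, std::uint8_t value) {
    if (addr == 0xD000) {                      // memory-mapped command register
        disk.command = value;                  // the device reacts to the write
        // ... perform the read/write/format and raise an interrupt when done ...
        return;
    }
    ram[addr] = value;
}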
I'd go into more detail, but there are a million ways you can go with it. If you have any specific questions here, feel free to ask and I'll add the info.
Resources:
I think I've given a pretty good intro here, but there are a ton of additional areas. I'm more than happy to help with any questions; I've been very vague in most of this simply due to the immense complexity.
Obligatory Wikipedia links:
Emulator
Dynamic recompilation
General emulation resources:
Zophar -- This is where I got my start with emulation, first downloading emulators and eventually plundering their immense archives of documentation. This is the absolute best resource you can possibly have.
NGEmu -- Not many direct resources, but their forums are unbeatable.
RomHacking.net -- The documents section contains resources regarding machine architecture for popular consoles
Emulator projects to reference:
IronBabel -- This is an emulation platform for .NET, written in Nemerle and recompiles code to C# on the fly. Disclaimer: This is my project, so pardon the shameless plug.
BSnes -- An awesome SNES emulator with the goal of cycle-perfect accuracy.
MAME -- The arcade emulator. Great reference.
6502asm.com -- This is a JavaScript 6502 emulator with a cool little forum.
dynarec'd 6502asm -- This is a little hack I did over a day or two. I took the existing emulator from 6502asm.com and changed it to dynamically recompile the code to JavaScript for massive speed increases.
Processor recompilation references:
The research into static recompilation done by Michael Steil (referenced above) culminated in this paper and you can find source and such here.
Addendum:
It's been well over a year since this answer was submitted and with all the attention it's been getting, I figured it's time to update some things.
Perhaps the most exciting thing in emulation right now is libcpu, started by the aforementioned Michael Steil. It's a library intended to support a large number of CPU cores, which use LLVM for recompilation (static and dynamic!). It's got huge potential, and I think it'll do great things for emulation.
emu-docs has also been brought to my attention, which houses a great repository of system documentation, which is very useful for emulation purposes. I haven't spent much time there, but it looks like they have a lot of great resources.
I'm glad this post has been helpful, and I'm hoping I can get off my arse and finish up my book on the subject by the end of the year/early next year.
A guy named Victor Moya del Barrio wrote his thesis on this topic. A lot of good information in its 152 pages. You can download the PDF here.
If you don't want to register with scribd, you can google for the PDF title, "Study of the techniques for emulation programming". There are a couple of different sources for the PDF.
Emulation may seem daunting, but it is actually easier than simulation.
Any processor typically has a well-written specification that describes states, interactions, etc.
If you did not care about performance at all, then you could easily emulate most older processors using very elegant object-oriented programs. For example, an x86 processor would need something to maintain the state of the registers (easy), something to maintain the state of the memory (easy), and something that would take each incoming instruction and apply it to the current state of the machine. If you really wanted accuracy, you would also emulate memory translation, caching, etc., but that is doable.
In fact, many microchip and CPU manufacturers test programs against an emulator of the chip and then against the chip itself, which helps them find out whether there are issues in the specification of the chip or in its actual hardware implementation. For example, it is possible to write a chip specification that would result in deadlocks, and when a deadlock occurs in the hardware it is important to see whether it can be reproduced from the specification, since that indicates a greater problem than something in the chip implementation.
Of course, emulators for video games usually care about performance so they don't use naive implementations, and they also include code that interfaces with the host system's OS, for example to use drawing and sound.
Considering the very slow performance of old video games (NES/SNES, etc.), emulation is quite easy on modern systems. In fact, it's even more amazing that you could just download a set of every SNES game ever or any Atari 2600 game ever, considering that when these systems were popular having free access to every cartridge would have been a dream come true.
I know that this question is a bit old, but I would like to add something to the discussion. Most of the answers here center around emulators interpreting the machine instructions of the systems they emulate.
However, there is a very well-known exception to this called "UltraHLE" (Wikipedia article). UltraHLE, one of the most famous emulators ever created, emulated commercial Nintendo 64 games (with decent performance on home computers) at a time when it was widely considered impossible to do so. As a matter of fact, Nintendo was still producing new titles for the Nintendo 64 when UltraHLE was created!
For the first time, I saw articles about emulators in print magazines where before, I had only seen them discussed on the web.
The concept of UltraHLE was to make the impossible possible by emulating C library calls instead of machine-level calls.
Something worth taking a look at is Imran Nazar's attempt at writing a Gameboy emulator in JavaScript.
Having created my own emulator of the BBC Microcomputer of the 80s (type VBeeb into Google), there are a number of things to know.
You're not emulating the real thing as such; that would be a replica. Instead, you're emulating state. A good example is a calculator: the real thing has buttons, a screen, a case, etc., but to emulate a calculator you only need to model whether buttons are up or down, which segments of the LCD are on, and so on. Basically, a set of numbers representing all the possible combinations of things that can change in a calculator.
You only need the interface of the emulator to appear and behave like the real thing; the more convincing it is, the closer the emulation. What goes on behind the scenes can be anything you like. But for ease of writing an emulator, there is a mental mapping between the real system, i.e. chips, displays, keyboards, circuit boards, and the abstract computer code.
To emulate a computer system, it's easiest to break it up into smaller chunks and emulate those chunks individually. Then string the whole lot together for the finished product. Much like a set of black boxes with inputs and outputs, which lends itself beautifully to object oriented programming. You can further subdivide these chunks to make life easier.
Practically speaking, you're generally trading off speed against fidelity of emulation, because the emulated software may run more slowly than it did on the original hardware. That may constrain the choice of programming language, compilers, target system, etc.
Further to that, you have to circumscribe what you're prepared to emulate. For example, it's not necessary to emulate the voltage state of transistors in a microprocessor, but it probably is necessary to emulate the state of the microprocessor's register set.
Generally speaking, the finer the level of detail of the emulation, the more fidelity you'll get to the original system.
Finally, information for older systems may be incomplete or non-existent. So getting hold of original equipment is essential, or at least prising apart another good emulator that someone else has written!
Yes, you have to interpret the whole binary machine code mess "by hand". Not only that, most of the time you also have to simulate some exotic hardware that doesn't have an equivalent on the target machine.
The simple approach is to interpret the instructions one-by-one. That works well, but it's slow. A faster approach is recompilation - translating the source machine code to target machine code. This is more complicated, as most instructions will not map one-to-one. Instead you will have to make elaborate work-arounds that involve additional code. But in the end it's much faster. Most modern emulators do this.
When you develop an emulator you are interpreting the processor assembly that the system runs on (Z80, 8080, PS CPU, etc.).
You also need to emulate all the peripherals that the system has (video output, controllers).
You should start by writing emulators for simple systems, like the good old Game Boy (which uses a Z80-like processor, if I'm not mistaken) or the C64.
Emulators are very hard to create, since there are many hacks (as in unusual effects), timing issues, etc. that you need to simulate.
For an example of this, see http://queue.acm.org/detail.cfm?id=1755886.
That will also show you why you 'need' a multi-GHz CPU to emulate a 1 MHz one.
Also check out Darek Mihocka's Emulators.com for great advice on instruction-level optimization for JITs, and many other goodies on building efficient emulators.
I've never done anything as fancy as emulating a game console, but I did take a course once where the assignment was to write an emulator for the machine described in Andrew Tanenbaum's Structured Computer Organization. That was fun and gave me a lot of aha moments. You might want to pick that book up before diving into writing a real emulator.
Advice on emulating a real system or your own thing?
I can say that emulators work by emulating the ENTIRE hardware. Maybe not down to the circuit level (moving bits around exactly as the hardware would; moving the byte is the end result, so copying the byte is fine). Emulators are very hard to create, since there are many hacks (as in unusual effects), timing issues, etc. that you need to simulate. If one (input) piece is wrong, the entire system can go down, or at best have a bug/glitch.
The Shared Source Device Emulator contains buildable source code to a PocketPC/Smartphone emulator (Requires Visual Studio, runs on Windows). I worked on V1 and V2 of the binary release.
It tackles many emulation issues:
- efficient address translation from guest virtual to guest physical to host virtual
- JIT compilation of guest code
- simulation of peripheral devices such as network adapters, touchscreen and audio
- UI integration, for host keyboard and mouse
- save/restore of state, for simulation of resume from low-power mode
To add to the answer provided by @Cody Brocious:
In the context of virtualization, where you are emulating a new system (CPU, I/O, etc.) for a virtual machine, we can see the following categories of emulators.
Interpretation: Bochs is an example of an interpreter. It is an x86 PC emulator; it takes each instruction from the guest system and translates it into another set of instructions (of the host ISA) to produce the intended effect. Yes, it is very slow; it doesn't cache anything, so every instruction goes through the same cycle.
Dynamic emulator: QEMU is a dynamic emulator. It does on-the-fly translation of guest instructions and caches the results. The best part is that it executes as many instructions as possible directly on the host system, so emulation is faster. Also, as mentioned by Cody, it divides the code into blocks (a single flow of execution).
Static emulator: As far as I know, there are no static emulators that are helpful in virtualization.
How I would start with emulation:
1. Get books on low-level programming; you'll need it for the "pretend" operating system of the Nintendo... Game Boy...
2. Get books on emulation specifically, and maybe OS development (you won't be making an OS, but it's the closest thing to it).
3. Look at some open-source emulators, especially ones for the system you want to emulate.
4. Copy snippets of the more complex code into your IDE/compiler. This will save you writing out long code. This is what I do for OS development: use a distro of Linux.
I wrote an article about emulating the Chip-8 system in JavaScript.
It's a great place to start as the system isn't very complicated, but you still learn how opcodes, the stack, registers, etc work.
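For a sense of scale, a minimal sketch of the Chip-8 fetch/decode step; Chip-8 opcodes are two bytes, big-endian, and the top nibble selects the instruction family:

#include <cstdint>

struct Chip8 {
    std::uint8_t  mem[4096] = {};
    std::uint8_t  v[16] = {};             // the V0..VF registers
    std::uint16_t i = 0, pc = 0x200;      // programs conventionally load at 0x200
};

void step(Chip8& c) {
    std::uint16_t op =
        static_cast<std::uint16_t>((c.mem[c.pc] << 8) | c.mem[c.pc + 1]);  // fetch two bytes
    c.pc += 2;
    switch (op & 0xF000) {                // decode on the top nibble
    case 0x6000:                          // 6XNN: set VX = NN
        c.v[(op >> 8) & 0x0F] = static_cast<std::uint8_t>(op & 0x00FF);
        break;
    case 0xA000:                          // ANNN: set index register I = NNN
        c.i = op & 0x0FFF;
        break;
    // ... the remaining opcode families ...
    }
}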
I will be writing a longer guide soon for the NES.

In game programming are global variables bad?

I know my gut reaction to global variables is "bad!", but in the two game development courses I've taken at my college globals were used extensively, and now in the DirectX 9 game programming tutorial I am using (www.directxtutorial.com) I'm being told globals are okay in game programming ...? The site also recommends using only structs if you can when doing game programming, to help keep things simple.
I'm really confused about this issue, and all the research I've been doing is very confusing. I realize there are issues with global variables (threading issues, they make code harder to maintain, their state is hard to track, etc.), but there is also a cost to not using globals: I'd have to pass a lot of information around very often, which can be confusing and, I imagine, time-costing (although I guess pointers would speed the process up; this is my first time writing a game in C++). Anyway, I realize there is probably no "right" or "wrong" answer here since both ways work, but I want my code to be as proper as I can, so any input would be good. Thank you very much!
The trouble with games and globals is that games (nowadays) are threaded at the engine level. Game developers using an engine use the engine's abstractions rather than programming concurrency directly (IIRC). In many high-level languages such as C++, threads sharing state is complex: when many concurrent processes share a common resource, they have to make sure they don't tread on each other's toes.
To get around this, you use concurrency control such as mutexes and various locks. These in effect make asynchronous critical sections of code access shared state in a synchronous manner for writing. The topic of concurrency control is too big to explain fully here.
Suffice to say, if threads run with global variables, it makes debugging very hard, as concurrency bugs are a nightmare (think, "which thread wrote that? Who holds that lock?").
There are exceptions in games programming API such as OpenGL and DX. If your shared data/globals are pointers to DX or OpenGL graphics contexts then typically this maps down to GPU operations which don't suffer so much from the same trouble.
Just be careful. Keeping objects representing 'player' or 'zombie' or whatever, and sharing them between threads can be tricky. Spawn 'player' threads and 'zombie group' threads instead and have a robust concurrency abstraction between them based on message passing rather than accessing those object's state across the thread/critical section boundary.
Saying all that, I do agree with the "Say no to globals" point made below.
For more on the complexities of threads and shared state see:
1 POSIX Threads API - I know it is POSIX, but provides a good idea that translates to other API
2 Wikipedia's excellent range of articles on concurrency control mechanisms
3 The Dining Philosopher's problem (and many others)
4 ThreadMentor tutorials and articles on threading
5 Another Intel article, but more of a marketing thing.
6 An ACM article on building multi-threaded game engines
Having worked on AAA game titles, I can tell you that globals should be eradicated immediately, before they spread like a cancer. I've seen them corrupt an I/O subsystem so completely that it had to be thrown out entirely and rewritten.
Say no to globals. Always.
In this respect, there's no difference between games and other programs. While arguably OK in small examples given in elementary courses, global variables are strongly discouraged in real programs.
So if you want to write correct, readable and maintainable code, stay away from global variables as much as possible.
All the answers so far deal with the globals/threads issue, but I will add my two cents to the struct vs. class (understood as all-public attributes vs. private attributes plus methods) discussion. Preferring structs over classes on the grounds that they are simpler is in the same line of thought as preferring assembler over C++ on the grounds that it is a simpler language.
The fact that you have to think about how your entities are going to be used, and provide methods for it, makes the concrete entity a little more complex, but it greatly simplifies the rest of the code and its maintainability. The whole point of encapsulation is that it simplifies the program by providing clear ways in which your data can be modified while maintaining your objects' invariants. You control the entry points and what can happen there. Having all attributes public implies that any part of the code can contain a small innocent error (forgetting to check condition X) and break your invariants completely (health below 0, but no 'death' processing being triggered).
The other common discussion is performance: if I just need to update a datum, having to call a method will impact my performance. Not really. If methods are simple and you provide them in the header as inlines (inside the class body, or outside it with the inline keyword), the compiler can copy those instructions to each use site. You get the guarantee that the compiler will not leave out any check by mistake, with no impact on performance.
Having read a bit more of what you posted, though:
"I'm being told globals are okay in game programming ...? The site also recommends using only structs if you can when doing game programming to help keep things simple."
Game code is no different from other code, really. Gratuitous use of globals is bad regardless. And as for 'only use structs', that is just a load of crap. Approach game development on the same principles as any other software; you may find places where you need to bend them, but they should be the exception, typically when dealing with low-level hardware issues.
I would say that the advice about globals and 'keeping things simple' is probably a mixture of being easier to learn and old-fashioned thinking about performance. When I was being taught game programming, I remember being told that C++ wasn't advised for games as it would be too slow, but I've worked on multiple games using every facet of C++, which proves that isn't true.
I would add to everyone's answers here that globals are to be avoided where possible, but don't be afraid to use whatever you need from C++ to make your code understandable and easy to use. If you come up against a performance problem, then profile that specific issue, and I'll bet that most of the time you won't need to remove the use of a C++ feature but will just need to think about your problem better. There may still be some platforms around that require pure C, but I don't really have experience of them; even the Gameboy Advance seemed to deal with C++ quite nicely.
The meta-issue here is state. How much state does any given function depend on, and how much does it change? Then consider how much of that state is implicit versus explicit, and cross that with how much is inspected versus changed.
If you have a function/method/whatever that is called DoStuff(), you have no idea from the outside what it depends on, what it needs, and what's going to happen to the shared state. If this is a class member, you also have no idea how that object's state is going to mutate. This is bad.
Contrast that with something like cosf(2): this function is understood not to change any global state, and any state that it requires (lookup tables, for example) is hidden from view and has no effect on your program at large. This is a function that computes a value based on what you give it, and it returns that value. It changes no state.
Class member functions then have the opportunity to pick up some of these problems. There's a huge difference between
myObject.hitpoints -= 4;
myObject.UpdateHealth();
and
myObject.TakeDamage(4);
In the first example, an external operation is changing some state that one of the object's member functions implicitly depends upon. The fact that these two lines can be separated by many other lines of code makes it non-obvious what's going to happen in the UpdateHealth call, even though, aside from the subtraction, it is the same as the TakeDamage call. Encapsulating the state changes (as in the second example) implies that the specifics of the state changes aren't important to the outside world, and hopefully they're not. In the first example, the state changes are explicitly important to the outside world, and this is really no different from setting some globals and calling a function that uses those globals. E.g. hopefully you'd never see
extern float value_to_sqrt;
value_to_sqrt = 2.0f;
sqrt(); // reads the global value_to_sqrt
extern float sqrt_value; // has the results of the sqrt.
And yet, how many people do exactly this sort of thing in other contexts? (Considering especially that class instance state is "global" in regards to that particular instance.)
So: prefer giving explicit arguments to your function calls, and prefer that they return their results directly, rather than having to explicitly set state before calling a function and then check other state after it returns.
The more state dependencies a bit of code has, the harder it'll be to make it multithread safe, but that has already been covered above. The point I want to make is that the problem isn't so much globals but more the visibility of the collection of state that is required for a bit of code to operate (and subsequently how much other code also depends on that state).
Most games aren't multi-threaded, although newer titles are going down that route, and so they've managed to get away with it so far.
Saying that globals are okay in games is like not bothering to fix the brakes on your car because you only drive at 10mph!
It's bad practice which ever way you look at it.
You only have to look at the number of bugs in games to see examples of this.
If class A needs access to data D, instead of making D global, you'd do better to put a reference to D inside A.
Globals are NOT intrinsically bad. In C for instance they are part of the language's normal use... and since C++ builds on C they still have a place.
On an aesthetic level, it's better to avoid them where you can sensibly make them part of a class, but if all you do is wrap a bunch of globals into a singleton, you made things worse because at least with globals it's obvious what the point is.
Be careful, but for some things it makes less sense to force OO concepts on what is actually a global value.
Two specific issues that I've encountered in the past:
First: If you're attempting to separate e.g. render phase (const access to most game state) from logic phase (non-const access to most game state) for whatever reason (decoupling render rate from logic rate, synchronizing game state across a network, recording and playback of gameplay at a fixed point in the frame, etc), globals make it very hard to enforce that.
Problems tend to creep in and become hard to debug and eradicate. This also has implications for threaded renderers separate from game logic, or the like (the other answers cover this topic thoroughly).
Second: The presence of many globals tends to bloat the literal pool, which the compiler typically places after each function.
If you get to your state through either a single "struct GlobalTable" which holds globals or a collection of methods on an object or the like, your literal pool tends to be a lot smaller, decreasing the size of the .text section in your executable.
This is mostly a concern for instruction set architectures that can't embed load targets directly into instructions (see e.g. fixed-width ARM or Thumb version 1 instruction encoding on ARM processors). Even on modern processors I'd wager you'll get slightly smaller codegen.
It also hurts doubly when your instruction and data caches are separate (again, as on some ARM processors); you'll tend to get two cachelines loaded where you might only need one with a unified cache. Since the literal pool may count as data, and won't always start on a cacheline boundary, the same cacheline might have to be loaded both into the i-cache and d-cache.
This may count as a "micro-optimization", but it's a transform that's relatively easily applied to an entire codebase (e.g. extern struct GlobalTable { /* ... */ } the; and then replace all g_foo with the.foo, as sketched below)... And we've seen code size decrease by 5 to 10 percent in some cases, with a corresponding boost to performance from keeping the caches cleaner.
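A minimal sketch of that transform; the member names are invented for illustration:

// Before: scattered globals, each needing its own literal-pool entry.
// int   g_score;
// float g_time_scale;

// After: one table, one base address; members are reached by base-plus-offset.
extern struct GlobalTable {
    int   score;
    float time_scale;
} the;

// In exactly one .cpp file:
// GlobalTable the;

void update() {
    the.score += 1;            // was: g_score += 1;
    the.time_scale = 0.5f;     // was: g_time_scale = 0.5f;
}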
I'm going to take a risk and say it depends on the scope/scale of your project. If you are trying to program an epic game with a boatload of code, then globals could easily cost you way more time than they save, in the most painful way, in hindsight.
But if you are trying to code, say, something as simple as Super Mario or Metroid or Contra, or even simpler, like Pac-Man, then consider that these games were originally coded in 6502 assembly and used globals for almost everything. Just about all the game data and state was stored in data segments, and that didn't stop the devs from shipping a very competent and popular product, in spite of working with tools and engineering standards that would probably horrify people today.
So if you are just writing this kind of small and simple game, one with a very limited scope that isn't designed to grow far beyond its original design or be maintained for years and years, with a few thousand lines of simple C++ code, then I don't see the big deal of using a global here or a singleton there. Someone obsessed with engineering Super Mario with the soundest techniques, SOLID and a DI framework, could end up taking far, far longer to ship than the devs who wrote it in 6502 asm.
And as I get older, there's something to it when I look at these old simple games and how they were coded. It almost seems like the devs were doing something right, in spite of the hard-coded magic numbers and globals all over the place, while I spend my career fumbling around trying to figure out the best way to engineer things. That said, this is probably a very unpopular opinion, and not one I would have liked either a decade or two ago, but there's something to it. I don't look at the 6502 asm of Metroid and think, "these devs under-engineered their product and their lives would have been so much easier if they had done this or that." It seems like they did things just about right.
But again, this is for small-scale stuff, maybe in the indie category of games by today's standards, and among the smaller of those indie games, far from doing anything ground-breaking in terms of how much data is processed or in using cutting-edge hardware techniques. If in doubt, I'd definitely suggest erring on the side of avoiding globals. It's also a little trickier in C++ than in, say, C, since you can have objects with constructors and destructors, and initialization and destruction order isn't well-defined or easily predictable for global objects. There I'd lean even further towards avoiding globals, since they can trip you up in whole new ways when you aren't explicitly initializing and destroying them yourself in a predictable order. And naturally, if you want to multithread a lot beyond a critical loop here and there, your ability to reason about the thread-safety of any particular code will be severely diminished if you cannot minimize the scope/visibility of your game state.

Programming in gamedev (performance related)

I am just wondering how some things work in gamedev:
I know that performance is crucial, so there is still no place for managed languages/platforms such as Java/.NET (and I think there never will be), simply because of their performance. But... recently I read somewhere here on SO that even though people creating games use C++ as a primary language, they actually do not use the STL or Boost (or not a lot of them). I think it has something to do with performance, right? If I am wrong, could you please tell me what the reasons are to avoid those libraries (which, I think, make a developer's life much easier)? Is it because of licensing (Boost)? And what about EA's version of the STL? Do other studios make their own versions too?
How "close to metal" game programming really is? Do you go deeper and closer to the machine? Do you sometimes use Assembly for critical inner loops, or C++ is actually the lowest abstraction layer that you use at the moment? I assume that in such products where performance is the most important thing profiling is very, very common task - but are you sometimes forced to use assembly to speed some parts up, or good C++ is "good enough"?
Edit:
Sorry, it may not have been clear, but I am interested in answers from people with game-industry experience. I am not interested in assumptions from people who have no commercial experience in game development. I am also not interested in examples of niche games created in C#/Java or whatever. However, if you know a product that looks better than FarCry2 (just an example; insert your favourite great-looking modern game here), is written entirely in Java/.NET, and has performance similar to FarCry2... do not hesitate to mention that product! Thanks.
1.
Contrary to some beliefs, the STL is quite optimized and not at all bad code. The reason most game studios don't use it is memory: you don't have as much control over memory allocation and deallocation as you would with your own versions of the STL containers. This is also the reason managed languages are not preferred.
Writing your own containers lets you write cross-platform code and do memory tracking more easily. This is especially true on consoles where, for instance, the PS3 requires detailed knowledge of the hardware in order to get the best performance out of it, which usually in the end means that you need full control over the memory flow between the PPU, SPUs and RSX.
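A minimal sketch of the memory-tracking idea: in-house containers route every allocation through one instrumented choke point, so each allocation can be tagged and counted per subsystem (all names are illustrative, not any studio's actual API):

#include <cstddef>
#include <cstdio>
#include <cstdlib>

static std::size_t g_bytes_in_use = 0;   // simplistic global tally

void* tracked_alloc(std::size_t n, const char* tag) {
    g_bytes_in_use += n;
    std::printf("[alloc] %zu bytes (%s), %zu in use\n", n, tag, g_bytes_in_use);
    return std::malloc(n);
}

void tracked_free(void* p, std::size_t n) {
    g_bytes_in_use -= n;
    std::free(p);
}

A custom vector or hash table then calls tracked_alloc()/tracked_free() instead of new/delete, which is exactly the control the STL's default allocator hides from you.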
2.
Assembler is only "required" (in quotes since it's not actually required but helps) for very specialized operations, e.g. math library functions. What's more common is SIMD intrinsics which vectorizes the code. However, most studios have legacy code which is quite optimized and since these optimizations are quite low level it's not code that needs to change greatly between hardware generations. I'd say on consoles it's much more common that you use lower level code.
I know, that the performance is actually crucial so there is still (and I think never will be) no place to use managed languages/platforms as Java/.NET just because of their performance.
No, you don't know this. You think it, you want to believe it, because you romanticize game development and because you think high-level languages can't be fast.
.NET performance is perfectly good enough for 90% of the games out there. And it's only going to get better. There is no inherent reason why managed platforms must be slower. They have the potential to be faster because they're JIT'ed. In practice, their performance tends to be about the same as reasonably good C++ code, much better than typical C++ code, and slightly worse than really good C++ code. And most big games use more than one language anyway. They use scripting languages, like Lua or Python, or some home-brewed stuff, all of which are orders of magnitude slower than .NET.
Similarly, there is absolutely no reason why most of your game couldn't be written in .NET. And then the three really performance-critical functions can, if necessary, be ported to native C++ later.
But... recently I have read somewhere here on SO, that even though people creating games use C++ as a primary language, they actually do not use STL or Boost (or a lot of them). In think it has something in common with performance, right? If I am wrong, could you please tell me what are the reasons to avoid those libraries
Same as you're guilty of above: superstition about game development. "Oh no, we can't afford to use other people's code! It's far too inefficient." Game development is stuck in the '80s in terms of programming practices and methodologies. In other words, don't worry too much about what other game developers do. If the STL or Boost makes your code easier to write, then use it! And then, if you experience performance problems, profile it, and if necessary, replace that particular library component with your own.
But most of the STL is literally zero overhead. And 95% of the code in any game is not performance critical. Treat game development like you would any other programming. Don't treat it as some magical land where every line of code must be perfectly optimized and where normal rules don't apply.
And what about EA's version of STL? Do other studios make their own versions too?
As far as I know, no. EA made it partly for internal use, but also as input to the C++ community as a whole, as an example of what they (and a lot of game developers) would like to see in future revisions of the standard (it was submitted to the standards committee as well).
I can recommend the book C++ for Game Programmers. It has an in-depth discussion of the performance cost of C++ features such as the STL, exceptions and RTTI. It also touches on having your own memory manager, and various common performance optimizations.
Apparently there is a new edition out, but it has a different author. What's up with that?
About 1:
I haven't tried EA's STL version, but I can confirm from my own game development experience that the STL can sometimes be a bottleneck. So far I was always able to find workarounds though.
Boost can be very helpful, but whether it helps or hurts performance-critical code really depends on the particular part of Boost. For example, Boost::filesystem was very useful for me, whereas boost::signals was barely usable due to very poor performance, so I implemented my own signaling library using FastDelegates instead.
About 2:
Most of the time you will get away with regular C++ code. Once the game is running and you can identify bottlenecks with your profiler, you can start optimizing those pieces of code. And even then, you might not have to write any assembler code if you do it right.
For example, my custom-built 2d game engine runs without hardware acceleration. I developed it when 3D drivers were still quite buggy and most casual gamers have outdated graphics card drivers, so compatibility was more important than pure performance at that time.
Still, in our latest game, Gemsweeper, we are using a lot of alpha blending with 8-bit alpha masks, and the game still has to run on 500 MHz CPUs. So alpha blending turned out to be a performance-critical area.
To optimize this, I've used the VectorC compiler so that I could take advantage of MMX, SSE and the like without having to write assembler code. But the same code can be fast on one CPU and slow on the other (e.g. Intel vs. AMD), so I also compiled the alpha blending code several times with different optimization settings. The game runs a benchmark at runtime to find the fastest blitting module for each blitting method and uses that module from then on.
I've compared the result with some other 2d blending libraries, one of which claimed to be pure hand-optimized assembler and my engine had about the same performance in average, as measured on different CPUs.
Bottom line:
Do not optimize without measuring first, and think about alternatives before starting to write assembler. This usually yields good enough results and will save you a lot of time.
We use the 'STL' (i.e. the standard C++ library) and a small amount of Boost. However, some of it is avoided or frowned upon (std::map, std::list, boost::shared_ptr), typically for their over-exuberant memory allocation policies or poor cache coherency. These can typically be worked around, e.g. with custom allocators, but instead we have other approaches to the same problems, with their own benefits.
As for how close to the metal it really is, it depends. In our project C++ is the lowest level we go. In another project in this studio, there is a little assembly, especially on the non-PC platforms. These days in certain projects you're just as likely to be limited by the GPU as by the CPU so the days of low level code optimisation are getting fewer and the days of optimising shaders and art assets are growing.
Be wary of claiming that Java/.NET etc. are never used, however. Not everybody needs the performance of FarCry2 (which is a pretty extreme spec), which is why you're seeing more and more games written in managed languages, with C++ reserved for the parts that need optimisation.
It is true that in game development, STL is not used. In spite of what certain people always rush to claim, they also never use Java or C# or other managed languages.
I'm not talking about small X-Box Live Arcade downloadable games or web browser games, or such things. I'm talking about high-end development in AAA games.
They don't use the STL. However, they do use their own custom implementations that look a lot like it. There will be smart arrays, there will be hash tables, there will be smart pointers; they just won't be STL.
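A minimal sketch of the kind of in-house container this refers to: a fixed-capacity array with a vector-like interface, so no heap allocation ever happens (the name and interface are invented for illustration; real versions use raw storage so T need not be default-constructible):

#include <cassert>
#include <cstddef>

template <typename T, std::size_t Capacity>
class FixedVector {                         // vector-like API, in-object storage
public:
    void push_back(const T& v) {
        assert(size_ < Capacity);           // blowing the budget is a bug, not a resize
        data_[size_++] = v;
    }
    T&          operator[](std::size_t i)       { return data_[i]; }
    const T&    operator[](std::size_t i) const { return data_[i]; }
    std::size_t size() const                    { return size_; }
private:
    T data_[Capacity];                      // memory budget fixed at compile time
    std::size_t size_ = 0;
};

FixedVector<int, 64> enemy_ids;             // the budget is visible in the type itself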
Consoles have some performance characteristics that are very different from PCs. Even game projects that exclusively target PC are usually using codebases that have been used for console projects in the past. A lot of tweaking goes into making the basic template structures work as desired.
Most game studios also want code that they can adapt to other platforms. Locking into an implementation from MS/Sony/Nintendo makes for a lot more pain when it comes time to port the game to a new platform. The provided template libraries (which aren't necessarily STL to start with) are often less than stellar. At least they are that way early in the hardware cycle when a studio is ironing out the engine they plan to keep using for the next five years.
At the studios I've worked at, I've certainly seen a fair degree of "not-built-here" attitude to dismiss third party code. Sometimes it's justified, sometimes it's not. In the case of basic data structure templates, it typically is.
As for your second question, assembler is occasionally used. But only in isolated situations where a large volume of math needs to happen very frequently. An entire engine might contain two or three smallish files of asm blocks.
You can find out for yourself (to a degree) by looking at game SDKs.
Almost all the id Tech 4 games (DooM III, Prey, Quake IV, ET:QW) have SDKs out, complete with physics, script, AI, math, etc. systems included. The only asm used is for specialized math code, everything else is pure C++.
Crytek has a Crysis SDK out (you'll need the game installed to install the SDK though) and Far Cry SDK.
Valve has the Source SDK available to anyone who has purchased a Source game through steam.
There are a lot more if you look. A lot of the code isn't particularly clean or flexible (sometimes not even fast), but I suppose it's easier to adjust things in code you've written as opposed to monolithic libraries full of hard to understand template-fu.
No, you are largely wrong. Both .NET and Java have been used in commercial games, certainly on Windows (and probably on consoles too).
The STL is also used widely; I know that quite a large proportion of amateur games developers use it.
Probably the main reasons for not using the STL are inertia, and the use of third-party libraries/engines which do not use it.
I imagine that historically, on some platforms, good STL implementations were not available, especially on RAM-limited hardware like the PS2.

Is there any reason to use C instead of C++ for embedded development? [closed]

Question
I have two compilers for my hardware: C++ and C89.
I'm thinking about using C++ with classes but without polymorphism (to avoid vtables).
The main reasons I’d like to use C++ are:
I prefer to use “inline” functions instead of macro definitions.
I’d like to use namespaces, as prefixes clutter the code.
I find C++ a bit more type-safe, mainly because of templates and verbose casting.
I really like overloaded functions and constructors (used for automatic casting).
Do you see any reason to stick with C89 when developing for very limited hardware (4kb of RAM)?
Conclusion
Thank you for your answers, they were really helpful!
I thought the subject through and I will stick with C mainly because:
It is easier to predict the actual generated code in C, and this is really important if you have only 4 KB of RAM.
My team consists mainly of C developers, so advanced C++ features won't be frequently used.
I've found a way to inline functions in my C compiler (C89).
It is hard to accept one answer as you provided so many good answers.
Unfortunately I can't create a wiki and accept it, so I will choose one answer that made me think most.
For a very resource constrained target such as 4KB of RAM, I'd test the waters with some samples before committing a lot of effort that can't be easily ported back into a pure ANSI C implementation.
The Embedded C++ working group did propose a standard subset of the language and a standard subset of the standard library to go with it. I lost track of that effort when the C User's Journal died, unfortunately. It looks like there is an article at Wikipedia, and that the committee still exists.
In an embedded environment, you really have to be careful about memory allocation. To enforce that care, you may need to define the global operator new() and its friends as something that can't even be linked, so that you know it isn't used. Placement new, on the other hand, is likely to be your friend when used judiciously along with a stable, thread-safe allocation scheme with guaranteed latency.
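A minimal sketch of both ideas, assuming a GCC-style toolchain built with -ffunction-sections and linked with --gc-sections (so the poisoned operator new is discarded when unused, and the link only fails if something actually pulls it in); Uart and its register address are hypothetical:

#include <cstddef>
#include <cstdint>
#include <new>

// Poison the global heap: this function is never defined anywhere, so any
// surviving reference to operator new becomes a link-time error.
extern void heap_use_is_forbidden();
void* operator new(std::size_t) { heap_use_is_forbidden(); return nullptr; }
void operator delete(void*) noexcept { heap_use_is_forbidden(); }

struct Uart {                                   // hypothetical driver object
    explicit Uart(std::uintptr_t base) : base_(base) {}
    std::uintptr_t base_;
};

// Placement new: construct the driver in statically reserved storage at a
// moment of our choosing (e.g. after the clocks have been configured).
alignas(Uart) static unsigned char uart_storage[sizeof(Uart)];
static Uart* g_uart = nullptr;

void init_drivers() {
    g_uart = new (uart_storage) Uart(0x40001000u);
}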
Inlined functions won't cause many problems, unless they are big enough that they should have been true functions in the first place. Of course, the macros they're replacing had the same issue.
Templates, too, may not cause problems unless their instantiation runs amok. For any template you do use, audit your generated code (the link map may have sufficient clues) to make certain that only the instantiations you intended to use actually happened.
One other issue that may arise is compatibility with your debugger. It isn't unusual for an otherwise usable hardware debugger to have very limited support for interaction with the original source code. If you effectively must debug in assembly, then the interesting name mangling of C++ can add extra confusion to the task.
RTTI, dynamic casts, multiple inheritance, heavy polymorphism, and exceptions all come with some amount of runtime cost for their use. A few of those features level that cost over the whole program if they are used, others just increase the weight of classes that need them. Know the difference, and choose advanced features wisely with full knowledge of at least a cursory cost/benefit analysis.
In a small embedded environment, you will either be linking against a real-time kernel or running directly on the hardware. Either way, you will need to make certain that your runtime startup code handles C++-specific startup chores correctly. This might be as simple as making sure to use the right linker options, but since it is common to have direct control over the source of the power-on reset entry point, you might need to audit that code to make certain it does everything. For example, on a ColdFire platform I worked on, the dev tools shipped with a CRT0.S module that had the C++ initializers present but commented out. If I had used it straight out of the box, I would have been mystified by global objects whose constructors never ran at all.
Also, in an embedded environment, it is often necessary to initialize hardware devices before they can be used, and if there is no OS and no boot loader, then it is your code that does that. You need to remember that constructors for global objects run before main() is called, so you will need to modify your local CRT0.S (or its equivalent) to get that hardware initialization done before the global constructors themselves run. Obviously, the top of main() is way too late.
Two reasons for using C over C++:
For a lot of embedded processors, either there is no C++ compiler, or you have to pay extra for it.
My experience is that a significant proportion of embedded software engineers have little or no experience of C++, either because of (1) or because it tends not to be taught in electronic engineering degrees, so it would be better to stick with what they know.
Also, the original question, and a number of comments, mention the 4 Kb of RAM. For a typical embedded processor, the amount of RAM is (mostly) unrelated to the code size, as the code is stored, and run from, flash.
Certainly, the amount of code storage space is something to bear in mind, but as new, more capacious, processors appear on the market, it's less of an issue than it used to be for all but the most cost-sensitive projects.
On the use of a subset of C++ for use with embedded systems: there is now a MISRA C++ standard, which may be worth a look.
EDIT: See also this question, which led to a debate about C vs C++ for embedded systems.
No. Any of the C++ language features that could cause problems (runtime polymorphism, RTTI, etc.) can be avoided while doing embedded development. There is a community of embedded C++ developers (I remember reading columns by embedded developers using C++ in the old C/C++ Users' Journal), and I can't imagine they'd be very vocal if the choice was that bad.
The Technical Report on C++ Performance is a great guide for this sort of thing. Note that it has a section on embedded programming concerns!
Also, +1 on the mention of Embedded C++ in the answers. The standard is not 100% to my taste, but it is a good bit of reference when deciding which parts of C++ you might drop.
While programming for small platforms, we disabled exceptions and RTTI, avoided virtual inheritance, and paid close attention to the number of virtual functions we had lying around.
Your friend is the linker map, though: check it frequently, and you'll spot sources of code and static memory bloat quickly.
After that, the standard dynamic memory usage considerations apply: in an environment as restricted as the one you mention, you may want to not use dynamic allocations at all. Sometimes you can get away with memory pools for small dynamic allocs, or "frame-based" allocation where you preallocate a block and throw out the whole thing later.
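As a minimal sketch of the memory-pool idea (fixed-size blocks; the names are made up and alignment handling is deliberately simplified):

#include <cstddef>

template <std::size_t BlockSize, std::size_t BlockCount>
class FixedPool {
    union Block { Block* next; unsigned char bytes[BlockSize]; };
    Block  storage_[BlockCount];
    Block* free_;
public:
    FixedPool() : free_(&storage_[0]) {
        for (std::size_t i = 0; i + 1 < BlockCount; ++i)
            storage_[i].next = &storage_[i + 1];  // thread the free list
        storage_[BlockCount - 1].next = 0;
    }
    void* allocate() {              // O(1), no fragmentation, no heap
        if (!free_) return 0;       // caller decides how to handle exhaustion
        Block* b = free_;
        free_ = b->next;
        return b;
    }
    void release(void* p) {         // push the block back onto the free list
        Block* b = static_cast<Block*>(p);
        b->next = free_;
        free_ = b;
    }
};

static FixedPool<32, 16> g_messagePool;  // 16 blocks of 32 bytes, all static

Because both the pool and its blocks are statically sized, the worst case is known at link time -- which is the whole point on a small target.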
I recommend using the C++ compiler, but limiting your use of C++ specific features. You can program like C in C++ (the C runtime is included when doing C++, though in most embedded applications you don't make use of the standard library anyway).
You can go ahead and use C++ classes etc., just
Limit your use of virtual functions (as you've said)
Limit your use of templates
For an embedded platform, you'll want to override operator new and/or use placement new for memory allocation.
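A hedged sketch of that idea; UartDriver is a made-up example class, and alignment of the buffer is glossed over (real code should guarantee it, e.g. with a union):

#include <new>    // placement new is declared here

struct UartDriver {
    explicit UartDriver(unsigned long baud) : baud_(baud) {}
    unsigned long baud_;
};

// Storage reserved at link time; the heap is never touched.
static unsigned char g_uartStorage[sizeof(UartDriver)];

UartDriver* init_uart() {
    return new (g_uartStorage) UartDriver(115200);  // construct in place
}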
As a firmware/embedded systems engineer, I can tell you some of the reasons why C is still the #1 choice over C++, and yes, I'm fluent in both of them.
1) Some targets we develop on have 64 kB of RAM for both code and data, so you have to make sure every byte counts. Yes, I've dealt with code optimization to save 4 bytes, which cost me 2 hours -- and that was in 2008.
2) Every C library function is reviewed before we let it into the final code, because of the size limitation, so we prefer people not to use divide (no hardware divider, so a big library is needed), malloc (we have no heap; all memory is allocated from a data buffer in 512-byte chunks and must be code reviewed), or other practices that carry a large penalty. Remember, every library function you use counts.
3) Ever heard of the term overlay? You have so little code space that sometimes you have to swap one set of code out for another. If you call a library function, that function must be resident; if it is only used from an overlay, you waste a lot of space by relying on too many object-oriented methods. So don't assume any C library function, let alone a C++ one, will be accepted.
4) Casting and even packing (where an unaligned data structure crosses a word boundary) are needed because of limited hardware design (i.e. an ECC engine that is wired a certain way) or to cope with a hardware bug. You cannot assume too much implicitly, so why object-orient it too much?
5) Worst-case scenarios: eliminating some object-oriented methods forces developers to think before they use resources that can explode (i.e. allocating 512 bytes on the stack rather than from a data buffer), and prevents some potential worst-case scenarios that would otherwise go untested, or eliminates those code paths altogether.
6) We do use a lot of abstraction to keep the hardware separate from the software and to make the code as portable and simulation-friendly as possible. Hardware access must be wrapped in a macro or inline function that is conditionally compiled for the different platforms (sketched below), data types must be cast to explicit byte sizes rather than target-specific types, direct pointer use is not allowed (because some platforms assume memory-mapped I/O works like data memory), and so on.
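A hypothetical sketch of that kind of conditional wrapper; the macro name, the platform define, and the sim_reg_write8() simulator hook are all made up for illustration:

#include <stdint.h>

#if defined(BUILD_FOR_TARGET)
  #define REG_WRITE8(addr, val) \
      (*(volatile uint8_t*)(addr) = (uint8_t)(val))       // real memory-mapped I/O
#else
  extern void sim_reg_write8(uint32_t addr, uint8_t val); // simulator hook
  #define REG_WRITE8(addr, val) sim_reg_write8((addr), (val))
#endif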
I can think of more, but you get the idea. We firmware guys do have object-oriented training, but the tasks in embedded systems can be so hardware-oriented and low-level that they are not high-level or abstractable by nature.
BTW, every firmware job I've been at uses source control; I don't know where you got that idea from.
-some firmware guy from SanDisk.
I have heard that some people prefer C for embedded work because it is simpler, and therefore it is easier to predict the actual code that will be generated.
I personally would think writing C-style C++ (using templates for type-safety) would give you a lot of advantages though and I can't see any real reason not to.
My personal preference is C because :
I know what every line of code is doing (and costs)
I don't know C++ well enough to know what every line of code is doing (and costs)
Why do people say this? You don't know what every line of C is doing unless you check the asm output. Same goes for C++.
For example, what asm does this innocent statement produce:
a[i] = b[j] * c[k];
It looks fairly innocent, but a gcc-based compiler produces this asm for an 8-bit micro:
CLRF 0x1f, ACCESS
RLCF 0xfdb, W, ACCESS
ANDLW 0xfe
RLCF 0x1f, F, ACCESS
MOVWF 0x1e, ACCESS
MOVLW 0xf9
MOVF 0xfdb, W, ACCESS
ADDWF 0x1e, W, ACCESS
MOVWF 0xfe9, ACCESS
MOVLW 0xfa
MOVF 0xfdb, W, ACCESS
ADDWFC 0x1f, W, ACCESS
MOVWF 0xfea, ACCESS
MOVFF 0xfee, 0x1c
NOP
MOVFF 0xfef, 0x1d
NOP
MOVLW 0x1
CLRF 0x1b, ACCESS
RLCF 0xfdb, W, ACCESS
ANDLW 0xfe
RLCF 0x1b, F, ACCESS
MOVWF 0x1a, ACCESS
MOVLW 0xfb
MOVF 0xfdb, W, ACCESS
ADDWF 0x1a, W, ACCESS
MOVWF 0xfe9, ACCESS
MOVLW 0xfc
MOVF 0xfdb, W, ACCESS
ADDWFC 0x1b, W, ACCESS
MOVWF 0xfea, ACCESS
MOVFF 0xfee, 0x18
NOP
MOVFF 0xfef, 0x19
NOP
MOVFF 0x18, 0x8
NOP
MOVFF 0x19, 0x9
NOP
MOVFF 0x1c, 0xd
NOP
MOVFF 0x1d, 0xe
NOP
CALL 0x2142, 0
NOP
MOVFF 0x6, 0x16
NOP
MOVFF 0x7, 0x17
NOP
CLRF 0x15, ACCESS
RLCF 0xfdf, W, ACCESS
ANDLW 0xfe
RLCF 0x15, F, ACCESS
MOVWF 0x14, ACCESS
MOVLW 0xfd
MOVF 0xfdb, W, ACCESS
ADDWF 0x14, W, ACCESS
MOVWF 0xfe9, ACCESS
MOVLW 0xfe
MOVF 0xfdb, W, ACCESS
ADDWFC 0x15, W, ACCESS
MOVWF 0xfea, ACCESS
MOVFF 0x16, 0xfee
NOP
MOVFF 0x17, 0xfed
NOP
The number of instructions produced depends massively on:
The sizes of a, b, and c.
Whether those pointers are stored on the stack or are global.
Whether i, j and k are on the stack or are global.
This is especially true in the tiny embedded world, where processors are just not set up to handle C. So my answer would be that C and C++ are just as bad as each other, unless you always examine the asm output, in which case they are just as good as each other.
Hugo
I see no reason to use C instead of C++. Whatever you can do in C, you can also do in C++. If you want to avoid the overhead of VMTs, don't use virtual methods and polymorphism.
However, C++ can provide some very useful idioms with no overhead. One of my favourites is RAII. Classes are not necessarily expensive in terms of memory or performance...
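A minimal RAII sketch in an embedded flavour, assuming hypothetical disable_irq()/restore_irq() intrinsics (not from any particular vendor):

extern "C" unsigned disable_irq();           // assumed platform intrinsic
extern "C" void     restore_irq(unsigned);   // assumed platform intrinsic

class IrqLock {
    unsigned saved_;
public:
    IrqLock() : saved_(disable_irq()) {}   // acquire in the constructor
    ~IrqLock() { restore_irq(saved_); }    // release in the destructor
private:
    IrqLock(const IrqLock&);               // non-copyable
    IrqLock& operator=(const IrqLock&);
};

void bump_shared_counter(volatile int& counter) {
    IrqLock lock;              // interrupts masked for exactly this scope
    counter = counter + 1;     // the critical section
}                              // interrupts restored on every exit path

The guard costs nothing beyond the two intrinsic calls you would have written anyway; what it buys is that no exit path can forget the restore.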
The human mind deals with complexity by evaluating as much as possible, then deciding what is important to focus on, and discarding or de-emphasizing the rest. This is the entire underpinning behind branding in marketing, and largely behind icons.
To combat this tendency I prefer C to C++, because it forces you to think about your code and how it interacts with the hardware -- relentlessly closely.
From long experience, it is my belief that C forces you to come up with better solutions to problems, in part by getting out of your way, not making you waste time satisfying constraints some compiler writer thought were a good idea, and not leaving you to figure out what is going on "under the covers".
In that vein, low level languages like C have you spending a lot of time focused on the hardware and building good data-structure/algorithm bundles, while high level languages have you spending a lot of time scratching your head wondering what's going on in there, and why you can't do something perfectly reasonable in your specific context and environment. Beating your compiler into submission (strong typing is the worst offender) is NOT a productive use of time.
I probably fit the programmer mold well - I like control. In my view, that's not a personality flaw for a programmer. Control is what we get paid to do. More specifically, FLAWLESSLY control. C gives you much more control than C++.
I've written some code for an ARM7 embedded platform on IAR Workbench. I highly recommend relying on templates to do compile-time optimization and path prediction. Avoid dynamic casting like the plague. Use traits/policies to your advantage, as prescribed in Andrei Alexandrescu's book, Modern C++ Design.
I know, it can be hard to learn, but I am also sure that your product will benefit from this approach.
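In that spirit, a small policy-based sketch (all names illustrative): the checking policy is chosen at compile time, so the unchecked configuration carries zero overhead.

template <typename T>
struct NoCheck {
    static void check(T*) {}                          // compiles away entirely
};

template <typename T>
struct TrapOnNull {
    static void check(T* p) { if (!p) for (;;) {} }   // halt on a bad pointer
};

template <typename T, template <typename> class CheckingPolicy = NoCheck>
class CheckedPtr {
    T* p_;
public:
    explicit CheckedPtr(T* p) : p_(p) { CheckingPolicy<T>::check(p); }
    T& operator*() const { return *p_; }
    T* operator->() const { return p_; }
};

// Usage: CheckedPtr<int> fast(p);  CheckedPtr<int, TrapOnNull> safe(p);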
For a system constrained to 4 kB of RAM, I would use C, not C++, just so that you can be sure to see everything that's going on. The thing with C++ is that it's very easy to use far more resources (both CPU and memory) than it looks like from glancing at the code. (Oh, I'll just create another BlerfObject to do that... whoops! Out of memory!)
You can do it in C++, as already mentioned (no RTTI, no vtables, etc, etc), but you'll spend as much time making sure your C++ usage doesn't get away from you as you would doing the equivalent in C.
A good reason, and sometimes the only reason, is that there is still no C++ compiler for the specific embedded system. This is the case, for example, for Microchip PIC microcontrollers. They are very easy to write for, and they have a free C compiler (actually, a slight variant of C), but there is no C++ compiler in sight.
Personally, with 4 kB of memory I'd say you are not getting that much more mileage out of C++, so just pick whichever seems the best compiler/runtime combination for the job, since the language probably won't matter much.
Note that it is not all about the language anyway, since the library matters too. Often C libs have a slightly smaller minimum size, but I could imagine that a C++ lib targeted at embedded development would be cut down, so be sure to test.
C wins on portability, because its language spec is less ambiguous, offering much better portability and flexibility across different compilers, etc. (fewer headaches).
If you aren't going to leverage C++ features to meet a need then go with C.
Do you see any reason to stick with C89 when developing for very limited hardware (4 kB of RAM)?
Personally, when it comes to embedded applications (and when I say embedded, I don't mean the bloated embedded devices of today, like WinCE or the iPhone), I mean resource-limited devices.
I prefer C, though I have worked with C++ quite a bit as well.
For example, the device you're talking about has 4 kB of RAM; just for that reason I wouldn't consider C++. Sure, you may be able to design something small using C++ and limit its use in your application, as other posts have suggested, but C++ "could" potentially end up complicating/bloating your application under the covers.
Are you going to link statically? You may want to compare a statically linked dummy application built with C++ vs. C. That may lead you to consider C instead. On the other hand, if you are able to build a C++ application within your memory requirements, go for it.
IMHO, in general, in embedded applications I like to know everything that is going on. Who's using memory/system resources, how much, and why? When do they free them up?
When developing for a target with X amount of resources (CPU, memory, etc.), I try to stay on the low side of those resources, because you never know what future requirements will come along, leaving you to add more code to a project that was "supposed" to be a simple, small application but ends up much bigger.
My choice is usually determined by the C library we decide to use, which is selected based on what the device needs to do. So, 9 times out of 10, it ends up being uclibc or newlib, and C. The kernel we use is a big influence on this too, as is whether we're writing our own kernel.
It's also a choice of common ground. Most good C programmers have no problem using C++ (even though many complain the entire time they use it), but I have not found the reverse to be true (in my experience).
On a project we're working on (which involves a ground-up kernel), most things are done in C; however, a small network stack was implemented in C++, because it was just easier and less problematic to implement networking that way.
The end result is, the device will either work and pass acceptance tests or it won't. If you can implement foo in xx stack and yy heap constraints using language z, go for it, use whatever makes you more productive.
My personal preference is C because :
I know what every line of code is doing (and costs)
I don't know C++ well enough to know what every line of code is doing (and costs)
Yes, I am comfortable with C++, but I don't know it as well as I do standard C.
Now if you can say the reverse of that, well, use what you know :) If it works, passes tests, etc .. what's the problem?
How much ROM/FLASH do you have?
4 kB of RAM can still mean there are hundreds of kilobytes of flash to store the actual code and static data. RAM of this size tends to be meant just for variables, and if you are careful with those, you can fit quite a large program (in terms of lines of code) into memory.
However, C++ tends to make putting code and data in flash more difficult, because of the run-time construction rules for objects. In C, a constant struct can easily be put into flash memory and accessed as a hardware-constant object. In C++, a constant object would require the compiler to evaluate the constructor at compile time, which I think is still beyond what C++ compilers can do (theoretically you could do it, but it is very hard in practice).
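A small sketch of that difference (the port/pin numbers are made up):

struct PinConfig { unsigned char port; unsigned char pin; };
const PinConfig kLedPin = { 2, 5 };   // plain data: the linker can place it in flash

struct Pin {
    Pin(unsigned char port, unsigned char pin) : port_(port), pin_(pin) {}
    unsigned char port_, pin_;
};
const Pin kLed(2, 5);                 // user-written ctor: constructed at boot,
                                      // typically living in RAM, not flash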
So in a "small RAM", "large FLASH" kind of environment I would go with C any day. Note that a good intermediate choice i C99 which has most of the nice C++ features for non-class-based-code.
Just want to say that there is no system with "unlimited" resources. Everything in this world is limited, and EVERY application should consider resource usage, no matter whether it's ASM, C, Java, or JavaScript. The dummies who allocate a few MB "just to be sure" make the iPhone 7, Pixel, and other devices extremely laggy, no matter whether you have 4 kB or 40 GB.
But on the other side of opposing resource waste is the time it takes to save those resources. If it takes an extra week to write a simple thing in C to save a few ticks and a few bytes, instead of using a C++ version that is already implemented, tested, and distributed, why bother? It's like buying a USB hub: yes, you can make one yourself, but is it going to be better? More reliable? Cheaper, once you count your time?
Just a side thought: even the power from your outlet is not unlimited. Try researching where it comes from, and you will see it mostly comes from burning something. The laws of conservation of energy and matter still hold: no material or energy appears or disappears; it only transforms.
In general, no. C++ is very nearly a superset of C. This is especially true for new projects.
You are on the right track in avoiding C++ constructs that can be expensive in terms of cpu time and memory foot print.
Note that some things, like polymorphism, can be very valuable -- they are essentially function pointers. If you find you need them, use them -- wisely.
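As an illustrative sketch: a virtual call is essentially one indirect call through a function-pointer table that the compiler maintains for you, much like a hand-rolled table of function pointers in C.

struct Sensor {
    virtual int read() = 0;      // one vtable slot per virtual function
    virtual ~Sensor() {}
};

struct AdcSensor : Sensor {
    int read() { return 123; }   // placeholder value for the sketch
};

int poll(Sensor& s) {
    return s.read();             // one indirect call through the vtable
}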
Also, good (well-designed) exception handling can make your embedded app more reliable than an app that handles things with traditional error codes.
For the memory allocation issue, I can recommend using the Quantum Platform and its state-machine approach, as it allocates everything you need at initialization time. It also helps to alleviate contention problems.
The product is available in both C and C++ versions.
Some say that C compilers can generate much more efficient code because they don't have to support the advanced C++ features and can therefore be more aggressive in their optimizations.
Of course, in this case you may want to put the two specific compilers to the test.
The only reason to prefer C, IMHO, would be if the C++ compiler for your platform is not in good shape (buggy, poor optimization, etc.).
You have inline in C99. Maybe you like ctors, but the business of getting dtors right can be messy. If the only remaining reason not to use C is namespaces, I would really stick to C89, because you might want to port the code to a slightly different embedded platform, and you may later start writing C++ against that same code. But beware the following points, where C++ is NOT a superset of C. I know you said you have a C89 compiler, but this comparison of C++ with C99 applies anyway, since the first item, for example, is true for any C since K&R.
sizeof 'a' > 1 in C, but not in C++.
In C you have variable-length arrays (VLAs). Example: void func(int i) { int a[i]; }
In C you have flexible array members. Example: struct s { int b; int m[]; };
It depends on the compiler.
Not all embedded compilers implement all of C++, and even if they do, they might not be good at avoiding code bloat (which is always a risk with templates). Test it with a few smaller programs, see if you run into any problems.
But given a good compiler, no, there's no reason not to use C++.
I recommend C++ with limitations and notes.
Time to market and maintainability. C++ development is easier and faster. So if you are in the design phase, choose a controller powerful enough to use C++. (Note that some high-volume markets require costs as low as possible, where you cannot make this choice.)
Speed. C can be faster than C++, but the speed gain is not big. So you can go with C++. Develop your algorithms, test them, and make them faster only if required(!). Use profilers to find the bottlenecks, and rewrite those parts in a plain-C, extern "C" style to approach C speed. (If it is still slow, implement that part in ASM.)
Binary size. C++ code is bigger, but here is a great answer that covers the details. The size of the compiled binary of a given piece of C code will be the same whether it is compiled with the C or the C++ compiler: "The executable size is hardly related to the language, but to the libraries you include in your project." Go with C++, but avoid advanced functionality like streams, string, new, virtual functions, etc. Review every library function before letting it into the final code, because of the size limitation (based on this answer).
A different answer, addressing a different aspect of the question:
"malloc"
Some previous replies talk quite a bit about this. Why do you even assume that call exists? On a truly small platform, malloc tends to be unavailable, or at least optional. Implementing dynamic memory allocation tends to be meaningful when you have an RTOS at the bottom of your system -- but until then, it is purely dangerous.
You can get very far without it. Just think about all the old FORTRAN programs which did not even have a proper stack for local variables...
There are a number of different controller manufacturers around the globe, and when you take a look at their designs and the instruction sets needed to configure them, you may end up in a lot of trouble. The main disadvantage of assembly language is that it is machine/architecture dependent. It is a huge ask for a developer to memorize all the instruction sets out there in order to code for different controllers. That is why C became more popular in embedded development: C is high-level enough to abstract the algorithms and data structures from hardware-dependent details, making the source code portable across a wide variety of target hardware, architecture-independent, and very easy to convert and maintain. But we do see some high-level (object-oriented) languages like C++, Python, Java, etc. evolving enough to come under the radar of embedded system development.
On such a limited system, just go for assembler. It gives you total control over every aspect, with no overhead.
It is probably a lot faster too, since many embedded compilers are not the best optimizers (especially compared with state-of-the-art desktop compilers such as Intel's or Visual Studio's).
"Yeah, yeah... but C is reusable and..." On such a limited system, chances are you won't reuse much of that code on a different system anyway. On the same system, assembler is just as reusable.

What can C++ do that is too hard or messy in any other language?

I still feel C++ offers some things that can't be beaten. It's not my intention to start a flame war here; please, if you have strong opinions about disliking C++, don't vent them here. I'm interested in hearing from C++ gurus about why they stick with it.
I'm particularly interested in aspects of C++ that are little known, or underutilised.
RAII / deterministic finalization. No, garbage collection is not just as good when you're dealing with a scarce, shared resource.
Unfettered access to OS APIs.
I have stayed with C++, as it is still the highest-performing general-purpose language for applications that need to combine efficiency and complexity. As an example, I write real-time surface modelling software for hand-held devices for the surveying industry. Given the limited resources, Java, C#, etc. just don't provide the necessary performance characteristics, whereas lower-level languages like C are much slower to develop in, given their weaker abstraction characteristics. The range of levels of abstraction available to a C++ developer is huge: at one extreme I can overload arithmetic operators so that I can say something like MaterialVolume = DesignSurface - GroundSurface, while at the same time running a number of different heaps to manage memory most efficiently for my app on a specific device. Combine this with a wealth of freely available source for solving pretty much any common problem, and you have one heck of a powerful development language.
Is C++ still the optimal development solution for most problems in most domains? Probably not, though at a pinch it can still be used for most of them. Is it still the best solution for efficient development of high performance applications? IMHO without a doubt.
Shooting oneself in the foot.
No other language offers such a creative array of tools. Pointers, multiple inheritance, templates, operator overloading and a preprocessor.
A wonderfully powerful language that also provides abundant opportunities for foot shooting.
Edit: I apologize if my lame attempt at humor has offended some. I consider C++ to be the most powerful language that I have ever used -- with abilities to code at the assembly language level when desired, and at a high level of abstraction when desired. C++ has been my primary language since the early '90s.
My answer was based on years of experience of shooting myself in the foot. At least C++ allows me to do so elegantly.
Deterministic object destruction leads to some magnificent design patterns. For instance, while RAII is not as general a technique as garbage collection, it leads to some impressive capabilities which you cannot get with GC.
C++ is also unique in that it has Turing-complete compile-time machinery (the template system). This allows you to prefer (as in the opposite of defer) a lot of code tasks to compile time instead of run time. For instance, in real code you might have an assert() statement to test for a never-happen. The reality is that it will sooner or later happen... and happen at 3:00am when you're on vacation. A C++ compile-time assert does the same test at compile time. Compile-time asserts fail between 8:00am and 5:00pm while you're sitting in front of the computer watching the code build; run-time asserts fail at 3:00am when you're asleep in Hawai'i. It's pretty easy to see the win there.
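A classic pre-C++11 sketch of such a compile-time assert (C++11 later made this a built-in as static_assert); the Packet layout and the expected size of 8 are assumptions for illustration:

template <bool Condition> struct CompileTimeAssert;     // undefined when false
template <> struct CompileTimeAssert<true> {};          // defined only when true

struct Packet { unsigned char header[4]; unsigned int payload; };

inline void check_packet_layout() {
    CompileTimeAssert<sizeof(Packet) == 8> ok;  // compile error if the layout drifts
    (void)ok;
}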
In most languages, strategy patterns are done at run time and throw exceptions in the event of a type mismatch. In C++, strategies can be done at compile time through templates and can be guaranteed typesafe.
Write inline assembly (MMX, SSE, etc.).
Deterministic object destruction. I.e. real destructors. Makes managing scarce resources easier. Allows for RAII.
Easier access to structured binary data. It's easier to cast a memory region as a struct than to parse it and copy each value into a struct (see the sketch after this list).
Multiple inheritance. Not everything can be done with interfaces. Sometimes you want to inherit actual functionality too.
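A hedged sketch of the structured-binary-data point (field layout, endianness, and padding are simplified; memcpy is used rather than a raw pointer cast to sidestep alignment pitfalls):

#include <cstring>

struct MsgHeader {
    unsigned char  type;
    unsigned char  flags;
    unsigned short length;
};

MsgHeader read_header(const unsigned char* buf) {
    MsgHeader h;
    std::memcpy(&h, buf, sizeof h);  // overlay the wire bytes onto the struct
    return h;
}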
I think I'm just going to praise C++ for its ability to use templates to catch expressions and execute them lazily when needed. For those who don't know what this is about, here is an example.
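A bare-bones sketch of the idea -- real expression-template libraries are far more general, and every name here is illustrative:

#include <cstddef>

struct Vec;

struct Sum {                        // lightweight proxy: no arithmetic yet
    const Vec& l;
    const Vec& r;
    Sum(const Vec& a, const Vec& b) : l(a), r(b) {}
    double at(std::size_t i) const;
};

struct Vec {
    double data[4];
    Vec& operator=(const Sum& e) {  // the loop runs only on assignment
        for (std::size_t i = 0; i < 4; ++i) data[i] = e.at(i);
        return *this;
    }
};

inline double Sum::at(std::size_t i) const { return l.data[i] + r.data[i]; }
inline Sum operator+(const Vec& a, const Vec& b) { return Sum(a, b); }

// Usage: c = a + b;  // operator+ only builds a Sum; the work happens in operator=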
Template mixins provide reuse that I haven't seen elsewhere. With them you can build up a large object with lots of behaviour as though you had written the whole thing by hand, yet all those small aspects of its functionality can be reused. It is particularly great for implementing parts of an interface (or the whole thing) when you are implementing a number of interfaces. The resulting object is lightning-fast because it is all inlined.
Speed may not matter in many cases, but when you're writing component software, and users may combine components in unanticipated, complicated ways, the speed of inlining in C++ seems to allow much more complex structures to be created.
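A small sketch of that layering (names illustrative); each mixin adds one aspect of behaviour, and the composed object inlines flat:

template <typename Base>
struct WithCounter : Base {
    unsigned ticks;
    WithCounter() : ticks(0) {}
    void tick() { ++ticks; }          // one reusable aspect
};

template <typename Base>
struct WithBounds : Base {
    bool in_range(int v) const { return v >= 0 && v < 100; }  // another aspect
};

struct Core {
    void run() {}                     // the behaviour being decorated
};

typedef WithBounds<WithCounter<Core> > Widget;  // composed from reusable parts

void demo() {
    Widget w;
    w.run();                 // from Core
    w.tick();                // from WithCounter
    (void)w.in_range(5);     // from WithBounds
}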
Absolute control over the memory layout, alignment, and access when you need it. If you're careful enough you can write some very cache-friendly programs. For multi-processor programs, you can also eliminate a lot of slow downs from cache coherence mechanisms.
(Okay, you can do this in C, assembly, and probably Fortran too. But C++ lets you write the rest of your program at a higher level.)
This will probably not be a popular answer, but I think what sets C++ apart are its compile-time capabilities, e.g. templates and #define. You can do all sorts of text manipulation on your program using these features, much of which has been abandoned in later languages in the name of simplicity. To me that's way more important than any low-level bit fiddling that's supposedly easier or faster in C++.
C#, for instance, doesn't have a real macro facility. You can't #include another file directly into the source, or use #define to manipulate the program as text. Think about any time you had to mechanically type repetitive code and you knew there was a better way. You may even have written a program to generate code for you. Well, the C++ preprocessor automates all of these things.
The "generics" facility in C# is similarly limited compared to C++ templates. C++ lets you apply the dot operator to a template type T blindly, calling (for example) methods that may not exist, and checks-for-correctness are only applied once the template is actually applied to a specific class. When that happens, if all the assumptions you made about T actually hold, then your code will compile. C# doesn't allow this... type "T" basically has to be dealt with as an Object, i.e. using only the lowest common denominator of operations available to everything (assignment, GetHashCode(), Equals()).
C# has done away with the preprocessor, and real generics, in the name of simplicity. Unfortunately, when I use C#, I find myself reaching for substitutes for these C++ constructs, which are inevitably more bloated and layered than the C++ approach. For example, I have seen programmers work around the absence of #include in several bloated ways: dynamically linking to external assemblies, re-defining constants in several locations (one file per project) or selecting constants from a database, etc.
As Mrs. Krabappel from The Simpsons once said, this is "pretty lame, Milhouse."
In terms of Computer Science, these compile-time features of C++ enable things like call-by-name parameter passing, which is known to be more powerful than call-by-value and call-by-reference.
Again, this is perhaps not the popular answer- any introductory C++ text will warn you off of #define, for example. But having worked with a wide variety of languages over many years, and having given consideration to the theory behind all of this, I think that many people are giving bad advice. This seems especially to be the case in the diluted sub-field known as "IT."
Passing POD structures across processes with minimum overhead. In other words, it allows us to easily handle blobs of binary data.
C# and Java force you to put your 'main()' function in a class. I find that weird, because it dilutes the meaning of a class.
To me, a class is a category of objects in your problem domain. A program is not such an object. So there should never be a class called 'Program' in your program. This would be equivalent to a mathematical proof using a symbol to notate itself -- the proof -- alongside symbols representing mathematical objects. It'll be just weird and inconsistent.
Fortunately, unlike C# and Java, C++ allows global functions, which lets your main() function exist outside any class. Therefore C++ offers a simpler, more consistent, and perhaps truer implementation of the object-oriented idiom. Hence, this is one thing C++ can do that C# and Java cannot.
I think that operator overloading is a quite nice feature. Of course it can be very much abused (like in Boost lambda).
Tight control over system resources (esp. memory) while offering powerful abstraction mechanisms optionally. The only language I know of that can come close to C++ in this regard is Ada.
C++ provides complete control over memory and, as a result, makes the flow of program execution much more predictable.
Not only can you say precisely when allocations and deallocations of memory occur, you can define your own heaps, have multiple heaps for different purposes, and say precisely where in memory data is allocated. This is frequently useful when programming embedded/real-time systems, such as games consoles, cell phones, MP3 players, etc., which:
have strict upper limits on memory that are easy to reach (contrast with a PC, which just gets slower as you run out of physical memory)
frequently have a non-homogeneous memory layout. You may want to allocate objects of one type in one piece of physical memory, and objects of another type in another piece.
have real time programming constraints. Unexpectedly calling the garbage collector at the wrong time can be disastrous.
AFAIK, C and C++ are the only sensible option for doing this kind of thing.
Well, to be quite honest, you can do just about anything if you're willing to write enough code.
So to answer your question: no, there is nothing C++ can do that you can't do in another language. It's just a matter of how much patience you have, and whether you are willing to devote the long sleepless nights to get it to work.
There are things that C++ wrappers make easy to do (because they can read the header files), like Office development. But again, that's because someone wrote lots of code to "wrap" it for you in an RCW, or "Runtime Callable Wrapper".
EDIT: You also realize this is a loaded question.