When opening a file, C or C++? - c++

I'm currently using C++ for a program I'm writing, and I have a query regarding old versus new.
Looking online, I see people using the C++ syntax to open files. I'm moving to C++, so I should keep up with the times; however, it occurs to me to ask: is it better to use the C way or the C++ way? I'm taking into consideration:
Security.
Speed.
Memory usage. (Although I think I might have an idea on this one.)
Thank you.

Neither plain C nor plain C++ gives you access to security features of the OS. In Windows and Linux/UNIX there are file system related security features and you have to use them in order to set or query file access rights.
Whether you're writing in C or in C++, security remains your responsibility. Neither of the languages frees you from things like input validation and error checks.
File I/O speed should be about the same on the same platform with the same compiler, unless you use different buffering modes or different sizes of buffers. The same should be true for the amount of memory used implicitly in the file I/O functions.
If you're writing in C++, you should generally use C++ I/O functions, unless there's something you can't do with them (e.g. you can't access OS-specific functionality and therefore are forced to use plain C functions provided by your OS).
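As a rough side-by-side of the two styles discussed above, here is a minimal sketch that reads the first line of a file once with C++ streams and once with C stdio. The function names (write_text, first_line_cpp, first_line_c) and the 256-byte buffer are assumptions made for illustration. In both versions the error checks are still written by hand.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Write `text` to `path` with C++ streams; returns false on failure.
bool write_text(const std::string& path, const std::string& text) {
    assert(!path.empty());
    std::ofstream out(path, std::ios::binary);
    if (!out) return false;              // opening can fail; check it
    out << text;
    return static_cast<bool>(out);
}   // the ofstream destructor closes the file

// Read the first line with C++ streams.
bool first_line_cpp(const std::string& path, std::string& line) {
    std::ifstream in(path);
    if (!in) return false;
    return static_cast<bool>(std::getline(in, line));
}   // the ifstream destructor closes the file

// Read the first line with C stdio.
// Lines longer than the buffer are truncated in this sketch.
bool first_line_c(const std::string& path, std::string& line) {
    std::FILE* f = std::fopen(path.c_str(), "r");
    if (!f) return false;
    char buf[256];
    bool ok = std::fgets(buf, sizeof buf, f) != nullptr;
    std::fclose(f);                      // must be closed by hand
    if (!ok) return false;
    line = buf;
    if (!line.empty() && line.back() == '\n') line.pop_back();
    return true;
}
```

Calling first_line_cpp and first_line_c on the same file should give the same string. The main practical difference in this sketch is that the stream object closes the file on every exit path, while the stdio version has to call fclose itself.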

Use the functionality that the C++ standard library provides you. If, and only if, you run into problems with speed or memory, start profiling and exploring other options.

Related

How does the C++ standard library work behind the scenes?

This question has been bothering me so much for the past couple of days. I was wondering how the standard library works, in terms of functionality. I couldn't find an answer anywhere, even by checking the source code provided by the LLVM compiler which is, for a beginner like me, a really complicated piece of code.
What I'm basically trying to understand here is how the C++ standard library works. For example, let's take the fstream header file, which consists of a bunch of functions that help to write to and read from files.
How does it work? Does it use the OS specific API (since the library is cross platform), or what? And, if the standard library can do it, aren't I supposed to be able to mess with some files as well without calling the standard fstream file (which to my experience I can't do)?
I apologize if my questions are unclear since I'm not a native English speaker: feel free to modify this text so as to make it clearer.
Does it use the OS specific API (since the library is cross platform), or what?
At some point, the OS specific API is used. The fstream implementation does not necessarily call an OS function directly. It might use other classes, which call functions inherited from C, etc., but eventually the call chain will lead to an OS call. (Yes, the details are often too complicated for an intermediate programmer to follow. So, as a self-described beginner, your findings are not surprising.)
The library is cross-platform in the sense that on your end (the C++ programmer), the interface is the same regardless of platform. It is not, however, the same library on every platform. Each platform has its own library, exposing the same interface on the C++ side, but making use of different OS calls. (In fact, the same platform might have multiple standard libraries, as the library implementation is provided by your toolchain, not by the standards committee.)
And, if the standard library can do it, aren't I supposed to be able to mess with some files as well without calling the standard fstream file (which to my experience I can't do)?
Yes, you are allowed to. Apparently, you have not been able to yet, but with some practice and guidance you should be able to. Everything in the standard library can be recreated in your own code. The point of the standard library (and most libraries, for that matter) is to save you time, not to enable something that was otherwise unavailable. For example, you don't have to implement a file stream for every program you write; it's in the standard library so you can focus on more interesting aspects of your project.
A compiler is just a program that creates an executable file or a library. You can use the compiler's default libraries to save time, or write your own. The default libraries communicate with the OS for file operations or memory allocation and provide simple standard classes, so the developer can write a single code base that works on all target platforms supported by the compiler and the libraries. If you want to write your own, you have to write each function for every target OS.
The standard library is cross-platform in a sense that its interface does not change between platforms but its implementation does - or in practical terms - if you only use C++ and its standard library, you can write your code the same way for Linux / Windows / MacOS / Android / Whatever and if you find a C++ compiler for one of those platforms that supports the language features you used, you will be able to compile your code for that platform without rewriting anything.
So while you can use std::vector or std::fstream or any other feature in the library independently of the platform you're writing for and expect the function definitions, type names, etc. to look the same, you cannot expect the executable which you compiled for a PC with Windows 10 to run on a phone with Android. You cannot even expect the same executable to run on the same PC with a different operating system. That is what I mean by "the implementation is different".
There are two main reasons for this difference:
Processors with different architectures (x86-64 and ARM for example) use different instruction sets and as such the C++ source would need to be compiled to a completely different machine code to run properly
Computers with processors of the same architecture which have a different operating system have different ways of dynamically allocating memory, creating files, creating streams, writing to console, creating and scheduling threads etc. - which is part of the system functionality that you use via the standard library
If you really wanted to, you could use HeapAlloc() instead of operator new() or CreateThread() instead of the standard library's std::thread, but that would force you to both rewrite your program every time you wanted to compile it for something other than Windows and recompile it with the target platform's compiler (and by proxy learn its API). The standard library saves you from that trouble by abstracting away those system calls.
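To make "same interface, different implementation" concrete, here is a minimal sketch of a helper whose signature is identical everywhere while its body calls a different OS API per platform. The name write_all is hypothetical, and the sketch makes no claim about how any particular standard library is written internally. The Windows branch uses CreateFileA, WriteFile and CloseHandle; the other branch uses POSIX open, write and close.

```cpp
#include <cassert>
#include <cstddef>

#ifdef _WIN32
  #include <windows.h>
#else
  #include <fcntl.h>
  #include <unistd.h>
#endif

// Hypothetical helper: one signature for callers, platform-specific body.
// Creates or truncates `path` and writes `len` bytes from `data`.
bool write_all(const char* path, const char* data, std::size_t len) {
    assert(path != nullptr && data != nullptr);
#ifdef _WIN32
    HANDLE h = CreateFileA(path, GENERIC_WRITE, 0, nullptr,
                           CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (h == INVALID_HANDLE_VALUE) return false;
    DWORD written = 0;
    BOOL ok = WriteFile(h, data, static_cast<DWORD>(len), &written, nullptr);
    CloseHandle(h);
    return ok && written == len;
#else
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) return false;
    std::size_t done = 0;
    while (done < len) {                 // write() may write fewer bytes than asked
        ssize_t n = write(fd, data + done, len - done);
        if (n <= 0) {                    // this sketch does not retry on EINTR
            close(fd);
            return false;
        }
        done += static_cast<std::size_t>(n);
    }
    return close(fd) == 0;
#endif
}
```

Code that calls write_all compiles unchanged on both kinds of platform, which is the property the standard library gives you for std::fstream without you having to maintain the #ifdef branches yourself.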
As for the fstream in particular, here is what it uses internally on most PCs nowadays.
Basically, fstream, iostream and printf are all built on an operating system call; on Unix-like systems it is write(). When your code calls printf (we use printf as an example), it eventually calls write() to let the kernel do the I/O work. After that, write() returns, printf returns, and your code continues.
So if you really want to know how printf works internally, you have to read the source code of the C library and, below that, the kernel.
But you shouldn't do that for now.
For a beginner: do not try to go deeper before you have a basic understanding of computers. A computer is a project, just like a building, so the right way to learn it is level by level. First, learn how to use brick and cement to build a building; this is what you should do for now. What you shouldn't do is get interested in how a brick is produced, and start focusing on that, the first time you try to use one while learning to build. That is the wrong way to learn IT.
If you are learning C/C++, just learn it. Remember, learn it level by level. For now, knowing how to use printf is enough.

Memory Editing Functions

I'm looking for functions to write and read process memory similar to the Win32API calls in windows.h but I can't seem to find any for standard C++ and I would like it to be platform independent.
There is no standard C++ API for accessing the memory of other processes. Standard C++ does not even have the concept of a 'process' at all. Moreover, the contents of the memory of other processes is highly platform-dependent, so adding a shim layer for porting to other OSes is the least of your problems.
You can't get platform independence directly, because those kinds of API calls depend on the OS kernel. You'd need to create wrappers for each operation (read, write) and switch the internal API call based on a preprocessor define (such as _WIN32).
The C++ standard supports memcpy(), memset(), memmove(), and memcmp(). There is also an STL utility, std::copy(). These are all platform independent.
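For reference, a minimal sketch of the portable functions just mentioned. Note that they only touch memory belonging to the calling process, so they are not a replacement for the Win32 calls that read or write another process. The type Sample and the helper names are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

struct Sample { int id; float value; };

// Byte-wise copy of a trivially copyable object within this process.
Sample clone_bytes(const Sample& src) {
    Sample dst;
    std::memcpy(&dst, &src, sizeof dst);
    return dst;
}

// Shift the buffer contents left by one byte. Source and destination
// overlap, so memmove is required here rather than memcpy.
void shift_left(char* buf, std::size_t len) {
    assert(buf != nullptr && len > 0);
    std::memmove(buf, buf + 1, len - 1);
    buf[len - 1] = '\0';
}

// Element-wise copy with std::copy; unlike memcpy it also works
// for types that are not trivially copyable.
void copy_ints(const int* first, const int* last, int* out) {
    std::copy(first, last, out);
}
```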
Which functions are you using on Windows? We're not sure what it is you're asking for, but if you show us what you're doing successfully there, we'll be able to help you find the equivalents on other platforms.

file handling routines on Windows

Is it allowed to mix different file handling functions in one system, e.g.
fopen() from cstdio
open() from fstream
CreateFile from Win API ?
I have a large application with a lot of legacy code and it seems that all three methods are used within this code. What are potential risks and side effects ?
Yes, you can mix all of that together. It all boils down to the CreateFile call in any case.
Of course, you can't pass a file pointer to CloseHandle and expect it to work, nor can you expect a handle opened from CreateFile to work with fclose.
Think of it exactly the same way you think of malloc/free vs new/delete in C++. Perfectly okay to use concurrently so long as you don't mix them.
It is perfectly OK to use all of these file methods, as long as they don't need to interact. The minute you need to pass a file opened with one method into a function that assumes a different method, you'll find that they're incompatible.
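A minimal sketch of two modules that use different file APIs side by side without ever handing a handle from one API to the other. The names FileCloser, CFile, append_log_c and read_all_cpp are hypothetical. The custom deleter ensures that a FILE* obtained from fopen can only ever be released with fclose.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <memory>
#include <string>

// Deleter so a FILE* is always released with fclose, never another API.
struct FileCloser {
    void operator()(std::FILE* f) const { if (f) std::fclose(f); }
};
using CFile = std::unique_ptr<std::FILE, FileCloser>;

// "Legacy" module: appends a message using C stdio.
bool append_log_c(const std::string& path, const std::string& msg) {
    assert(!path.empty());
    CFile f(std::fopen(path.c_str(), "ab"));
    if (!f) return false;
    return std::fputs(msg.c_str(), f.get()) >= 0;
}   // fclose runs here and flushes the data (its result is ignored in this sketch)

// "Newer" module: reads the whole file using C++ streams.
std::string read_all_cpp(const std::string& path) {
    std::ifstream in(path, std::ios::binary);
    return std::string((std::istreambuf_iterator<char>(in)),
                       std::istreambuf_iterator<char>());
}
```

Both functions can work on the same path because each one opens and closes the file through its own API. What would not work is passing the FILE* to code that expects a std::ifstream or a Win32 HANDLE.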
As a matter of style I would recommend picking one and sticking to it, but if the code came from multiple sources that may not be possible. It would be a big refactoring effort to change the existing code, without much gain.
Your situation isn't that uncommon.
Code that is designed to be portable is usually written using standard file access routines (fopen, open, etc). Code that is OS-specific is commonly written using that OS's native API. Your large application is most likely a combination of these two types of code. You should have no problem mixing the file access styles in the same program as long as you remember to keep them straight (they are not interchangeable).
The biggest risk involved here is probably portability. If you have legacy code that has been around for a while, it probably uses the standard C/C++ file access methods, especially if it pre-dates the Win32 API. Using the Win32 API is acceptable, but you must realize that you are binding your code to the scope and lifetime of that API. You will have to do extra work to port that code to another platform. You will also have to re-work this code if, say, in the future Microsoft obsoletes the Win32 API in favor of something new. The standard C/C++ methods will always be there, constant and unchanging. If you want to help future-proof your code, stick to standard methods and functions as much as possible. At the same time, there are some things that require the Win32 API and can't be done using standard functions.
If you are working with a mix of C-style, C++-style, and Win32-style code, then I would suggest separating (as best as is reasonably possible) your OS-specific code and your portable code into separate modules with well-defined APIs. If you have to re-write your Win32 code in the future, this can make things easier.

Why are drivers and firmwares almost always written in C or ASM and not C++?

I am just curious why drivers and firmwares almost always are written in C or Assembly, and not C++?
I have heard that there is a technical reason for this.
Does anyone know this?
Lots of love,
Louise
Because, most of the time, the operating system (or a "run-time library") provides the stdlib functionality required by C++.
In C and ASM you can create bare executables, which contain no external dependencies.
However, since Windows does support the C++ stdlib, most Windows drivers are written in (a limited subset of) C++.
Also, when firmware is written in ASM it is usually because either (A) the platform it is executing on does not have a C++ compiler or (B) there are extreme speed or size constraints.
Note that (B) hasn't generally been an issue since the early 2000s.
Code in the kernel runs in a very different environment than in user space. There is no process separation, so errors are a lot harder to recover from; exceptions are pretty much out of the question. There are different memory allocators, so it can be harder to get new and delete to work properly in a kernel context. There is less of the standard library available, making it a lot harder to use a language like C++ effectively.
Windows allows the use of a very limited subset of C++ in kernel drivers; essentially, those things which could be trivially translated to C, such as variable declarations in places besides the beginning of blocks. They recommend against use of new and delete, and do not have support for RTTI or most of the C++ standard library.
Mac OS X uses I/O Kit, which is a framework based on a limited subset of C++, though as far as I can tell more complete than that allowed on Windows. It is essentially C++ without exceptions and RTTI.
Most Unix-like operating systems (Linux, the BSDs) are written in C, and I think that no one has ever really seen the benefit of adding C++ support to the kernel, given that C++ in the kernel is generally so limited.
1) "Because it's always been that way" - this actually explains more than you think - given that the APIs on pretty much all current systems were originally written to a C or ASM based model, and given that a lot of prior code exists in C and ASM, it's often easier to 'go with the flow' than to figure out how to take advantage of C++.
2) Environment - To use all of C++'s features, you need quite a runtime environment, some of which is just a pain to provide to a driver. It's easier to do if you limit your feature set, but among other things, memory management can get very interesting in C++, if you don't have much of a heap. Exceptions are also very interesting to consider in this environment, as is RTTI.
3) "I can't see what it does". It is possible for any reasonably skilled programmer to look at a line of C and have a good idea of what happens at a machine code level to implement that line. Obviously optimization changes that somewhat, but for the most part, you can tell what's going on. In C++, given operator overloading, constructors, destructors, exceptions, etc., it gets really hard to have any idea of what's going to happen on a given line of code. When writing device drivers, this can be deadly, because you often MUST know whether you are going to interact with the memory manager, or if the line of code affects (or depends on) interrupt levels or masking.
It is entirely possible to write device drivers under Windows using C++ - I've done it myself. The caveat is that you have to be careful about which C++ features you use, and where you use them from.
Except for wider tool support and hardware portability, I don't think there's a compelling reason to limit yourself to C anymore. I often see complicated hand-coded stuff done in C that can be more naturally done in C++:
The grouping into "modules" of functions (non-general purpose) that work only on the same data structure (often called "object") -> Use C++ classes.
Use of a "handle" pointer so that module functions can work with "instances" of data structures -> Use C++ classes.
File scope static functions that are not part of a module's API -> C++ private member functions, anonymous namespaces, or "detail" namespaces.
Use of function-like macros -> C++ templates and inline/constexpr functions
Different runtime behavior depending on a type ID with either hand-made vtable ("descriptor") or dispatched with a switch statement -> C++ polymorphism
Error-prone pointer arithmetic for marshalling/demarshalling data from/to a communications port, or use of non-portable structures -> C++ stream concept (not necessarily std::iostream)
Prefixing the hell out of everything to avoid name clashes: C++ namespaces
Macros as compile-time constants -> C++11 constexpr constants
Forgetting to close resources before handles go out of scope -> C++ RAII
None of the C++ features described above cost more than the hand-written C implementations. I'm probably missing some more. I think the inertia of C in this area has more to do with C being mostly used.
Of course, you may not be able to use STL liberally (or at all) in a constrained environment, but that doesn't mean you can't use C++ as a "better C".
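A few of the replacements listed above, put together in a minimal sketch that uses no STL containers and no heap. The uart namespace, kMaxPorts, clamp_to, port_in_use and PortLock are all hypothetical names; the "port table" is a plain array standing in for a real hardware resource.

```cpp
#include <cassert>

namespace uart {                         // namespace instead of a uart_ prefix

constexpr int kMaxPorts = 4;             // constexpr instead of #define MAX_PORTS 4

// Template instead of a function-like macro: arguments are evaluated
// exactly once and are type-checked.
template <typename T>
constexpr T clamp_to(T v, T lo, T hi) {
    return v < lo ? lo : (v > hi ? hi : v);
}

// Stand-in for a hardware resource table (illustration only).
inline bool& port_in_use(int port) {
    static bool used[kMaxPorts] = {};
    assert(port >= 0 && port < kMaxPorts);
    return used[port];
}

// RAII: the port is claimed in the constructor and released in the
// destructor, so it cannot be forgotten on an early return.
class PortLock {
public:
    explicit PortLock(int port) : port_(port) { port_in_use(port_) = true; }
    ~PortLock() { port_in_use(port_) = false; }
    PortLock(const PortLock&) = delete;
    PortLock& operator=(const PortLock&) = delete;
private:
    int port_;
};

}  // namespace uart
```

Declaring uart::PortLock lock(1); inside a block marks port 1 as in use until the block ends. The class holds a single int and has no virtual functions, so it should cost no more than the equivalent hand-written claim and release calls in C.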
The comments I run into as why a shop is using C for an embedded system versus C++ are:
C++ produces code bloat.
C++ exceptions take up too much room.
C++ polymorphism and virtual tables use too much memory or execution time.
The people in the shop don't know the C++ language.
The only valid reason may be the last. I've seen C language programs that incorporate OOP, function objects and virtual functions. It gets very ugly very fast and bloats the code.
Exception handling in C, when implemented correctly, takes up a lot of room. I would say about the same as C++. The benefit of C++ exceptions: they are in the language and programmers don't have to reinvent the wheel.
The reason I prefer C++ to C in embedded systems is that C++ is a more strongly typed language. More issues can be found at compile time, which reduces development time. Also, C++ is an easier language than C in which to implement object-oriented concepts.
Most of the reasons against C++ are around design concepts rather than the actual language.
The biggest reason C is used instead of say extremely guarded Java is that it is very easy to keep sight of what memory is used for a given operation. C is very addressing oriented. Of key concern in writing kernel code is avoiding referencing memory that might cause a page fault at an inconvenient moment.
C++ can be used, but only if the run-time is specially adapted to reference only internal tables in fixed (non-pageable) memory when the run-time machinery is invoked implicitly, e.g. using a vtable when calling virtual functions. This special adaptation does not come "out of the box" most of the time.
Integrating C with a platform is much easier to do as it is easy to strip C of its standard library and keep control of memory accesses utterly explicit. So what with it also being a well-known language it is often the choice of kernel tools designers.
Edit: Removed reference to new and delete calls (this was wrong/misleading); replaced with more general "run-time machinery" phrase.
The reason that C, not C++ is used is NOT:
Because C++ is slower
Or because the c-runtime is already present.
It IS because C++ uses exceptions.
Most implementations of C++ language exceptions are unusable in driver code because drivers are invoked when the OS is responding to hardware interrupts. During a hardware interrupt, driver code is NOT allowed to use exceptions as that would/could cause recursive interrupts. Also, the stack space available to code while in the context of an interrupt is typically very small (and non growable as a consequence of the no exceptions rule).
You can of course use new(std::nothrow), but because exceptions in C++ are now ubiquitous, you cannot rely on any library code to use std::nothrow semantics.
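For readers who have not seen it, a minimal sketch of the std::nothrow form of new. The helper names alloc_buffer and free_buffer are hypothetical. On allocation failure the expression yields a null pointer instead of throwing std::bad_alloc; it does nothing about exceptions thrown elsewhere, for example by a constructor or by library code.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Allocate a byte buffer without exceptions.
// Returns nullptr if the allocation fails.
unsigned char* alloc_buffer(std::size_t bytes) {
    assert(bytes > 0);
    return new (std::nothrow) unsigned char[bytes];
}

// Release a buffer from alloc_buffer; delete[] on nullptr is a no-op.
void free_buffer(unsigned char* p) {
    delete[] p;
}
```

The caller has to check the returned pointer before use, exactly as it would with malloc in C.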
It IS also because C++ gave up a few features of C :-
In drivers, code placement is important. Device drivers need to be able to respond to interrupts. Interrupt code MUST be placed in code segments that are "non paged", or permanently mapped into memory, as, if the code was in paged memory, it might be paged out when called upon, which will cause an exception, which is banned.
In C compilers that are used for driver development, there are #pragma directives that control which type of memory functions end up in.
As non paged pool is a very limited resource, you do NOT want to mark your entire driver as non paged: C++ however generates a lot of implicit code. Default constructors for example. There is no way to bracket C++ implicitly generated code to control its placement, and because conversion operators are automatically called there is no way for code audits to guarantee that there are no side effects calling out to paged code.
So, to summarise :- The reason C, not C++ is used for driver development, is because drivers written in C++ would either consume unreasonable amounts of non-paged memory, or crash the OS kernel.
C is very close to a machine independent assembly language. Most OS-type programming is down at the "bare metal" level. With C, the code you read is the actual code. C++ can hide things that C cannot.
This is just my opinion, but I've spent a lot of time in my life debugging device drivers and OS related things. Often by looking at assembly language. Keep it simple at the low level and let the application level get fancy.
Windows drivers are written in C++.
Linux drivers are written in C because the kernel is written in C.
Probably because C is still often faster, smaller when compiled, and more consistent in compilation between different OS versions, and with fewer dependencies. Also, as C++ is really built on C, the question is: do you need what it provides?
There is probably something to the fact that people who write drivers and firmware are usually used to working at the OS level (or lower), which is in C, and therefore are used to using C for this type of problem.
The reason that drivers and firmware are mostly written in C or ASM is that there is then no dependency on the actual runtime libraries. Imagine this driver written in C:
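#include <stdio.h>
#define OS_VER 5.10
#define DRIVER_VER "1.2.3"
/* Hypothetical driver interface, declared here only so the example is complete. */
typedef struct driverstructinfo {
    double version;
    void (*InitDriver)(void);
    void (*Exit)(int code);
    int (*GetDispatchId)(void);
    void (*Dispatch)(struct driverstructinfo *dsi);
} driverstructinfo;
void fooDispatch(driverstructinfo *dsi);   /* forward declaration */
int drivermain(driverstructinfo **dsi){
    if ((*dsi)->version > OS_VER){
        (*dsi)->InitDriver();
        /* printf depends on the C runtime library; that is the point of the example */
        printf("FooBar Driver Loaded\n");
        printf("Version: %s\n", DRIVER_VER);
        (*dsi)->Dispatch = fooDispatch;
        return 0;
    }else{
        (*dsi)->Exit(0);
        return 1;
    }
}
void fooDispatch(driverstructinfo *dsi){
    printf("Dispatched %d\n", dsi->GetDispatchId());
}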
Notice that the runtime library support would have to be pulled in and linked in during compile/link. It would not work, because the runtime environment is not fully set up while the operating system is still in its load/initialize phase, so there would be no way to execute printf. The call would probably sound the death knell of the operating system (a kernel panic on Linux, a Blue Screen on Windows), as there is no reference on how to execute the function.
Put another way, driver code has the privilege to execute alongside the kernel code and shares the same space. Ring0 is the ultimate code execution privilege (all instructions allowed); ring3 is where the front end of the operating system runs (limited execution privilege). In other words, ring3 code cannot use an instruction that is reserved for ring0; the kernel will kill the code by trapping it, as if to say 'Hey, you have no privilege to tread on ring0's domain'.
The other reason why it is written in assembler is mainly code size and raw native speed. This could be the case for, say, a serial port driver, where input/output timing, latency and buffering are critical to its function.
Most device drivers (in the case of Windows) are built with a special compiler toolchain (WinDDK), which can compile C code but does not link against the normal standard C runtime libraries.
There is one toolkit that enables you to build a driver within Visual Studio: VisualDDK. Make no mistake, building a driver is not for the faint of heart. Expect stress from staring at blue screens and kernel panics, wondering why, debugging drivers and so on.
The debugging side is harder. Ring0 code is not easily accessible from ring3 code, as the doors to it are shut. Access goes through the kernel trap door (for want of a better word): even when asked politely, the door stays shut while the kernel delegates the task to a handler residing in ring0 and executes it; whatever results are returned are passed back out to the ring3 code, and the door still stays firmly shut. That is the analogy for how userland code can execute privileged code in ring0.
Furthermore, this privileged code can easily trample over the kernel's memory space and corrupt something, hence the kernel panics and blue screens.
Hope this helps.
Perhaps because a driver doesn't require object oriented features, while the fact that C still has somewhat more mature compilers would make a difference.
There are many styles of programming, such as procedural, functional, object oriented etc. Object oriented programming is more suited to modeling the real world.
I would use object-oriented programming for device drivers if it suits the task. But most of the time, when you are programming device drivers, you do not need the advantages provided by C++, such as abstraction, polymorphism, code reuse etc.
Well, IOKit drivers for Mac OS X are written in a C++ subset (no exceptions, templates, multiple inheritance). And there is even a possibility to write Linux kernel modules in Haskell.
Otherwise, C, being a portable assembly language, perfectly captures the von Neumann architecture and computation model, allowing direct control over all its peculiarities and drawbacks (such as the "von Neumann bottleneck"). C does exactly what it was designed for and captures its target abstraction model completely and flawlessly (well, except for the implicit assumption of a single control flow, which could have been generalized to cover the reality of hardware threads), and this is why I think it is a beautiful language. Restricting the expressive power of the language to such basics eliminates most of the unpredictable transformation details when different computational models are being applied to this de facto standard. In other words, C makes you stick to basics and allows pretty much direct control over what you are doing. For example, when modeling behavior commonality with virtual functions, you control exactly how the function pointer tables get stored and used, in contrast to C++'s implicit vtbl allocation and management. This is in fact helpful when considering caches.
Having said that, the object-based paradigm is very useful for representing physical objects and their dependencies. Adding inheritance, we get the object-oriented paradigm, which in turn is very useful for representing the structure and behavior hierarchy of physical objects. Nothing stops anyone from using it and expressing it in C, again allowing full control over exactly how your objects are created, stored, destroyed and copied. In fact, that is the approach taken in the Linux device model. They have "objects" to represent devices, an object implementation hierarchy to model power management dependencies, and hacked-up inheritance functionality to represent device families, all done in C.
Because at the system level, drivers need to control every bit of every byte of memory. Other, higher-level languages cannot do that, or cannot do it natively; only C and ASM can.

To write a bootloader in C or C++?

I am writing a program, more specifically a bootloader, for an embedded system. I am going to use a C library to interact with some of the hardware components and I have the choice of writing it either in C or C++. Is there any reason I should choose one over the other? I do not need the object oriented features of C++ but it does have a stronger type system. Could it have other language features that would make the program more robust? I know some people avoid C++ because it can (but not always) generate large firmware images.
This isn't a particularly straightforward question to answer. It depends on a number of factors including:
How you prefer to lay out your code.
Whether there's a C++ compiler available for your target (and any other targets you may wish to use the bootloader on).
How critical the code size is for your application (we're talking about 10% extra maybe, not MB as suggested by another answer).
Personally, I really like classes as a way of laying out my code. Even when writing C code, I'll tend to keep everything in modular files with file-scope static functions "simulating" member functions and (a few) file-scope static variables to "simulate" member variables. Having said that, most of my existing embedded projects (all of which are relatively small scale, up to a maximum of 128kB flash including bootloader, but usually less) have tended to be written in C. Now that I have a C++ compiler though, I'm certainly considering moving to C++.
There are considerable benefits to C++ from simply using references, overloading and templates, even if you don't go as far as classes. Certainly, I'd stop short of using a lot of more advanced features, including the use of dynamic memory allocation (new). Then again, I'd avoid dynamic memory allocation (malloc etc) in embedded C as well if possible.
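A minimal sketch of what references, overloading and templates can buy without any classes or dynamic allocation. The names checksum and put, and the little-endian byte order, are assumptions made for illustration and are not taken from any particular bootloader.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Template plus reference-to-array: the length N is deduced at compile
// time, so a separate length argument cannot get out of sync with the array.
template <std::size_t N>
std::uint8_t checksum(const std::uint8_t (&data)[N]) {
    std::uint8_t sum = 0;
    for (std::size_t i = 0; i < N; ++i) {
        sum = static_cast<std::uint8_t>(sum + data[i]);   // wraps modulo 256
    }
    return sum;
}

// Overloading: one name for storing values of different widths.
inline void put(std::uint8_t* out, std::uint8_t v) {
    assert(out != nullptr);
    out[0] = v;
}

inline void put(std::uint8_t* out, std::uint16_t v) {   // little-endian
    assert(out != nullptr);
    out[0] = static_cast<std::uint8_t>(v & 0xFF);
    out[1] = static_cast<std::uint8_t>(v >> 8);
}
```

Passing a pointer instead of an array to checksum is a compile error rather than a silent bug, which is the kind of extra checking the answer refers to.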
If you have a C++ compiler (even if it's only g++), it is worth running your code through it just for the additional type checking so that you can reduce the number of problems in your code. The C++ compiler can pick up on a few things that even static analysis tools won't spot.
For a good discussion on many invalid reasons people reject C++, see Dan Saks' article on Embedded.com.
For a boot-loader the obvious choice is C, especially on an embedded system. The generated code will need to be close to the metal, and very easy to debug, likely by dropping into assembly, which quickly becomes difficult without care in C++. Also C tool-chains are far more ubiquitous than C++ tool-chains, allowing your boot-loader to be used on more platforms. Lastly, generated binaries are typically smaller, and use less memory when written C style.
If you don't need to use Object Orientation, use C. Simple choice there. It's simpler and easier, whilst accomplishing the same task.
Some die hards will disagree, but OO is what makes C++ > C, and vice versa in a lot of circumstances.
I would use C unless there is a specific reason to use C++. For a Bootloader you are not really going to need OO.
Use the simplest tool that will accomplish the job.
Writing programs in C is not the same as writing them in C++. If you only know how to do it in C++, then your choice is C++. For a bootloader it is better to minimize code size, so you will probably have to disable the standard C++ library. If you know how to write in C, then you should use C; it is the more common choice for this kind of task.
Most of the previous answers assume that your bootloader is small and simple which is typically the case; however, if it becomes more complex (i.e. you need to be able to load from an Ethernet port, a USB port, or a serial port...you need to validate the code that is being loaded before you wipe out your existing code, etc.) you may want to consider C++.
I have also found that the bootloader and the application typically share some amount of common code so you may also want to consider using the same language as your application to facilitate the code sharing.
The C language is substantially easier to parse than C++. This means a program that is both valid C and valid C++ will compile faster as a C program. Probably not a major concern, but it is just another reason why C++ is probably overkill.
Go with C++ and choose what language features you need. You still have full control of the output object code as long as you understand the C++ abstractions that you're using.
Use of OO can still run well if you avoid the use of virtual functions. Avoid immutable object types that require a lot of copying in order to pass values, like std::string. But, you can still use features like templates without any real impact on runtime performance.
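One way to get polymorphic-looking code without virtual functions is to let a template pick the implementation at compile time. The sketch below is only an illustration: PatternSource, EmptySource and load_image are hypothetical names, and the two "sources" are stand-ins for real transports such as a serial or USB port.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in for a transport that always delivers data (counting bytes).
struct PatternSource {
    unsigned char next = 0;
    std::size_t read(unsigned char* dst, std::size_t n) {
        for (std::size_t i = 0; i < n; ++i) dst[i] = next++;
        return n;
    }
};

// Stand-in for a transport with nothing to deliver.
struct EmptySource {
    std::size_t read(unsigned char*, std::size_t) { return 0; }
};

// Any type with a matching read() member works. The call is resolved at
// compile time, so there is no vtable and no indirect call.
template <typename Source>
bool load_image(Source& src, unsigned char* dst, std::size_t size) {
    assert(dst != nullptr);
    std::size_t done = 0;
    while (done < size) {
        std::size_t n = src.read(dst + done, size - done);
        if (n == 0) return false;        // source ran dry before the image was complete
        done += n;
    }
    return true;
}
```

The trade-off is that each Source type gets its own copy of load_image in the binary, so for many transports the code size can grow where a single virtual interface would not.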
Use C with µClibc. It will make your code simpler and reduce its footprint. It can be found at www.uclibc.org.