Somebody told me that allocating with malloc is not secure anymore. I'm not a C/C++ guru, but I've written some code with malloc in C/C++. Does anyone know what risks I'm exposed to?
Quoting him:
[..] But indeed the weak point of C/C++ it is the security, and the Achilles' heel is indeed malloc and the abuse of pointers. C/C++ it is a well known insecure language. [..] There would be few apps in what I would not recommend to continue programming with C++."
It's probably true that C++'s new is safer than malloc(), but that doesn't automatically make malloc() more unsafe than it was before. Did your friend say why he considers it insecure?
However, here's a few things you should pay attention to:
1) With C++, you do need to be careful when you use malloc()/free() and new/delete side-by-side in the same program. This is possible and permissible, but everything that was allocated with malloc() must be freed with free(), and not with delete. Similarly, everything that was allocated with new must be freed with delete, and never with free(). (This logic goes even further: If you allocate an array with new[], you must free it with delete[], and not just with delete.) Always use corresponding counterparts for allocation and deallocation, per object.
int* ni = new int;
free(ni); // ERROR: don't do this!
delete ni; // OK
int* mi = (int*)malloc(sizeof(int));
delete mi; // ERROR!
free(mi); // OK
2) malloc() and new (speaking again of C++) don't do exactly the same thing. malloc() just gives you a chunk of memory to use; new will additionally call a constructor (if available). Similarly, delete will call a destructor (if available), while free() won't. This could lead to problems, such as incorrectly initialized objects (because the constructor wasn't called) or un-freed resources (because the destructor wasn't called).
3) C++'s new also takes care of allocating the right amount of memory for the type specified, while you need to calculate this yourself with malloc():
int *ni = new int;
int *mi = (int*)malloc(sizeof(int)); // required amount of memory must be
// explicitly specified!
// (in some situations, you can make this
// a little safer against code changes by
// writing sizeof(*mi) instead.)
Conclusion:
In C++, new/delete should be preferred over malloc()/free() where possible. (In C, new/delete is not available, so the choice would be obvious there.)
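To make point 2 concrete, here is a minimal sketch. Widget and its two counters are hypothetical names invented for this illustration; they only record how often the constructor and destructor run.

```cpp
#include <cstdlib>

// Hypothetical type for illustration; the counters are not part of any library.
struct Widget {
    static int constructed;
    static int destroyed;
    int value;
    Widget() : value(42) { ++constructed; }
    ~Widget() { ++destroyed; }
};
int Widget::constructed = 0;
int Widget::destroyed = 0;

// Allocates one Widget with new/delete: constructor and destructor both run.
void use_new_delete() {
    Widget* w = new Widget;
    delete w;
}

// Allocates raw storage with malloc/free: neither constructor nor destructor
// runs, so the memory never holds a live Widget and its value must not be read.
void use_malloc_free() {
    void* raw = std::malloc(sizeof(Widget));
    std::free(raw);
}
```

Calling use_malloc_free() leaves both counters untouched, while use_new_delete() increments each of them once.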
[...] C/C++ it is a well known insecure language. [...]
Actually, that's wrong. Actually, "C/C++" doesn't even exist. There's C, and there's C++. They share some (or, if you want, a lot of) syntax, but they are indeed very different languages.
One thing they differ in vastly is their way to manage dynamic memory. The C way is indeed using malloc()/free() and if you need dynamic memory there's very little else you can do but use them (or a few siblings of malloc()).
The C++ way is not to deal (manually) with dynamic resources (of which memory is but one) at all. Resource management is handed to a few well-implemented and well-tested classes, preferably from the standard library, and then done automatically. For example, instead of manually dealing with zero-terminated character buffers, there's std::string; instead of manually dealing with dynamically allocated arrays, there's std::vector; instead of manually dealing with open files, there's the std::fstream family of streams, etc.
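As a small sketch of that style: join is a hypothetical helper name chosen for this example. It builds a string from several pieces without a single malloc, free, new or delete in user code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Joins words with a separator. All buffers are owned by std::string and
// std::vector, so there is no manual allocation to get wrong.
std::string join(const std::vector<std::string>& words, const std::string& sep) {
    std::string out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i != 0) out += sep;
        out += words[i];
    }
    return out;  // the string grows and releases its own storage
}
```

The C equivalent would need to compute the total length, allocate, copy, and document who frees the result.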
Your friend could be talking about:
The safety of using pointers in general. For example in C++ if you're allocating an array of char with malloc, question why you aren't using a string or vector. Pointers aren't insecure, but code that's buggy due to incorrect use of pointers is.
Something about malloc in particular. Most OSes clear memory before first handing it to a process, for security reasons. Otherwise, sensitive data from one app, could be leaked to another app. On OSes that don't do that, you could argue that there's an insecurity related to malloc. It's really more related to free.
It's also possible your friend doesn't know what he's talking about. When someone says "X is insecure", my response is, "in what way?".
Maybe your friend is older, and isn't familiar with how things work now - I used to think C and C++ were effectively the same until I discovered many new things about the language that have come out in the last 10 years (most of my teachers were old-school Bell Laboratories guys who wrote primarily in C and had only a cursory knowledge of C++ - and Bell Laboratories engineers invented C++!). Don't laugh at him/her - you might be there someday too!
I think your friend is uncomfortable with the idea that you have to do your own memory management - i.e., it's easy to make mistakes. In that regard, it is insecure and he/she is correct... However, that insecure aspect can be overcome with good programming practices, like RAII and using smart pointers.
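A minimal sketch of RAII with a smart pointer, assuming C++14 for std::make_unique. Connection and its live counter are hypothetical names made up for the example.

```cpp
#include <memory>

// Hypothetical resource type; live counts currently existing instances.
struct Connection {
    static int live;
    Connection() { ++live; }
    ~Connection() { --live; }
};
int Connection::live = 0;

// Every return path releases the Connection, because the unique_ptr's
// destructor calls delete when conn goes out of scope.
bool do_work(bool bail_out_early) {
    auto conn = std::make_unique<Connection>();
    if (bail_out_early) {
        return false;  // no explicit delete needed here
    }
    return Connection::live == 1;
}
```

Whichever branch is taken, Connection::live is back to zero once do_work returns.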
For many applications, though, automated garbage collection is probably fine. Some programmers are confused about how pointers work, so getting new, inexperienced developers to program effectively in C or C++ without some training might be difficult. That may be why your friend thinks C/C++ should be avoided.
It's the only way to allocate and deallocate memory in C natively. If you misuse it, it can be as insecure as anything else. Microsoft provides some "secure" versions of other functions that take an extra size_t parameter - maybe your friend was referring to something similar? If that's the case, perhaps he simply prefers calloc() over malloc()?
If you are using C, you have to use malloc to allocate memory, unless you have a third-party library that will allocate / manage your memory for you.
Certainly your friend has a point that it is difficult to write secure code in C, especially when you are allocating memory and dealing with buffers. But we all know that, right? :)
Maybe what he wanted to warn you about is pointer usage. Yes, that will cause problems if you don't understand how it works. Otherwise, ask what your friend meant, or ask him for a reference that proves his claim.
Saying that malloc is not safe is like saying "don't use system X because it's insecure".
Until that, use malloc in C, and new in C++.
If you use malloc in C++, people will look mad at you, but that's fine in very specific occasions.
There is nothing wrong with malloc as such. Your friend apparently means that manual memory management is insecure and easily leads to bugs, compared to other languages where memory is managed automatically by a garbage collector. (Not that leaks are impossible there: nowadays nobody cares whether the program cleans up when it terminates; what matters is that nothing hogs memory while the program is running.)
Of course in C++ you wouldn't really touch malloc at all (because it simply isn't functionally equivalent to new and just doesn't do what you need, assuming most of the time you don't want just to get raw memory). And in addition, it is completely possible to program using techniques which almost entirely eliminate the possibility of memory leaks and corruption (RAII), but that takes expertise.
Technically speaking, malloc was never secure to begin with, but that aside, the only thing I can think of is the infamous "OOM killer" (OOM = out-of-memory) that the Linux kernel uses. You can read up on it if you want. Other than that, I don't see how malloc itself is inherently insecure.
In C++, there is no such problem if you stick to good conventions. In C, well, practice. Malloc itself is not an inherently insecure function at all - people can simply deal with its results inadequately.
It is not secure to use malloc because it's not possible to write a large-scale application and ensure every malloc is freed in an efficient manner. Thus, you will have tons of memory leaks, which may or may not be a problem. But when you double free, or use the wrong delete, etc., undefined behaviour results. Indeed, using the wrong delete in C++ can allow arbitrary code execution.
The ONLY way for code written in a language like C or C++ to be secure is to mathematically prove the entire program with its dependencies factored in.
Modern memory-safe languages are safe from these types of bugs as long as the underlying language implementation isn't vulnerable (which is indeed rare because these are all written in C/C++, but as we move towards hardware JVMs, this problem will go away).
Perhaps the person was referring to the possibility of accessing data via malloc()?
Malloc doesn't affect the contents of the region that it provides, so it MAY be possible to collect data from other processes by mallocing a large area and then scanning the contents.
free() doesn't clear memory either, so data placed into dynamically allocated buffers is, in principle, accessible.
I know someone who, many years ago admittedly, exploited malloc to create an inter-process communication scheme when he found that mallocs of equal size would return the address of the most recently free'd block.
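If leftover data in freed buffers is the concern, one common mitigation is to overwrite sensitive buffers before freeing them. Below is a rough sketch; wipe and wipe_and_free are hypothetical helper names. The volatile pointer is a widely used trick to discourage the compiler from dropping the writes as dead stores; where available, platform functions such as explicit_bzero or memset_s are intended for this job.

```cpp
#include <cstddef>
#include <cstdlib>

// Overwrites n bytes with zeros through a volatile pointer, so the compiler
// is discouraged from removing the writes as dead stores before free().
void wipe(void* p, std::size_t n) {
    volatile unsigned char* bytes = static_cast<volatile unsigned char*>(p);
    for (std::size_t i = 0; i < n; ++i) bytes[i] = 0;
}

// Hypothetical helper: wipe a malloc'ed buffer of n bytes, then free it.
void wipe_and_free(void* p, std::size_t n) {
    if (p == nullptr) return;
    wipe(p, n);
    std::free(p);
}
```

This only covers the buffer itself; copies made elsewhere (registers, swapped pages, realloc'ed blocks) are outside its reach.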
I have a background in Java and I'm still not fully used to the concept of pointers and scope, so sorry if the question seems silly.
I read somewhere that I should delete pointers that I have allocated on the Heap. I understand that but should I also delete a pointer that is given to me like this:
#include <dirent.h>
DIR* dir;
struct dirent* entries;
dir = opendir("D:/DIR");
entries = readdir(dir);
// Should I delete the pointers after I'm done with them?
delete entries;
delete dir;
Should I delete the pointers which are assigned from somewhere else or just going out of scope deletes them automatically?
Or is it even right to delete them since I haven't assigned them using new? But then if it's wrong how can I make sure that the memory assigned from the other methods would be deleted after I'm finished using them?
Not necessarily.
The unavoidable rule in C++ is that every new needs to be paired with a delete, and every new[] with a delete[], and that malloc, free, &c. are still available.
So it's tempting to suppose that the memory you get back from readdir is required to be released with an explicit call to delete, but that might not be the case:
1. It might have been allocated with a new[], or even malloc!
2. The library might provide a function that you need to call that releases the memory.
3. If the library is designed to work well with the C++ standard library, it might provide you with a deleter, which you can pass on construction to a smart pointer such as std::unique_ptr.
I think that (2) is most likely and is the most sensible since different C++ runtime environments might perform new and delete differently. (3) is an extension to this concept and if they support it then use that.
The golden rule is to check the documentation and do what it tells you.
There's no definitive answer to the question, because it always depends on the semantics by which the memory was allocated. For example in the very code example you gave, you must not use delete for deallocation, because opendir is not a C++ function (but a POSIX one) and to properly close it you call closedir. The pointer itself then can be discarded (no deletion required, closedir internally does the cleanup). Just make sure you don't use it after free (see: use-after-free-bug).
In general you always have to consult the manual of the function that gives you a pointer, where it's also specified exactly how to deallocate it.
Just to give you the idea:
malloc/calloc/realloc → free
fopen → fclose
X… → XFree
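For the opendir case from the question, the pairing can be automated with a smart pointer and a custom deleter. This is a sketch for POSIX systems only; DirCloser, DirHandle and count_entries are hypothetical names, not part of any library.

```cpp
#include <dirent.h>
#include <memory>

// Custom deleter so unique_ptr releases the handle with closedir, not delete.
struct DirCloser {
    void operator()(DIR* d) const { closedir(d); }
};
using DirHandle = std::unique_ptr<DIR, DirCloser>;

// Counts the entries of a directory (including "." and ".." where reported);
// returns -1 if the directory cannot be opened.
int count_entries(const char* path) {
    DirHandle dir(opendir(path));
    if (!dir) return -1;
    int n = 0;
    // readdir returns a pointer owned by the DIR stream: never free or delete it.
    while (readdir(dir.get()) != nullptr) ++n;
    return n;  // closedir runs here automatically
}
```

The same pattern works for the other pairs in the list, for example a FILE* released with fclose.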
C is not the same as C++, notably for this aspect.
When using some external C (or C++) function provided by some external library, you should read its documentation and follow the "ownership" rules and conventions. For example, if you use getline, you need to free the buffer it allocates. If you use opendir you should use closedir. If you use sqlite3_prepare_v2 you'll need to sqlite3_finalize it, etc... Sometimes you'll think in terms of some abstract data type with a destructor-like function.
When you develop your own (public or "private") function in C, you need to document whether it returns heap-allocated memory and who is responsible for freeing (or releasing) it, and how.
With C++, you also have smart pointers and RAII; so you generally can avoid manual new and delete. Smart pointers are helpful, but not a silver bullet.
So you should make your ownership conventions explicit and document them (and follow the conventions of external libraries and APIs). Understanding and defining such conventions wisely is an important task.
Circular references are difficult to handle (consider weak pointers when applicable). I recommend reading about garbage collection concepts and techniques (e.g. the GC handbook), to at least be able to name appropriately your approaches and understand the limitations and power of reference counting. In some cases, you might even explicitly use a garbage collector library or code your own allocator.
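To illustrate the circular-reference problem with reference counting, here is a small sketch. StrongNode, WeakNode and the live counter are hypothetical names for this example. The first function builds a cycle of owning pointers and reports how many nodes survive their handles; it then breaks the cycle by hand so the demonstration itself does not leak.

```cpp
#include <memory>
#include <utility>

// Counts currently existing nodes of both hypothetical types below.
static int live = 0;

struct StrongNode {
    std::shared_ptr<StrongNode> other;  // owning link in both directions
    StrongNode() { ++live; }
    ~StrongNode() { --live; }
};

struct WeakNode {
    std::shared_ptr<WeakNode> next;  // owning link
    std::weak_ptr<WeakNode> prev;    // non-owning back link breaks the cycle
    WeakNode() { ++live; }
    ~WeakNode() { --live; }
};

// Returns the number of nodes still alive after the local handles are gone.
int leaked_with_strong_cycle() {
    const int before = live;
    StrongNode* raw = nullptr;
    {
        auto a = std::make_shared<StrongNode>();
        auto b = std::make_shared<StrongNode>();
        a->other = b;
        b->other = a;  // the reference counts can no longer reach zero
        raw = a.get();
    }  // both handles are gone, yet neither node was destroyed
    const int leaked = live - before;
    // Break the cycle by hand so this demonstration does not really leak.
    std::shared_ptr<StrongNode> b = std::move(raw->other);
    return leaked;
}

int leaked_with_weak_back_link() {
    const int before = live;
    {
        auto a = std::make_shared<WeakNode>();
        auto b = std::make_shared<WeakNode>();
        a->next = b;
        b->prev = a;  // does not keep a alive
    }
    return live - before;
}
```

With owning links in both directions two nodes outlive their handles; with a weak back link none do.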
Manual memory management is a whole program property, and that is why it is hard. There are even cases (long-living processes doing a lot of allocation) where you need to be afraid of fragmentation.
Tools like valgrind and the address sanitizer (with GCC's instrumentation options or Clang's ones) are practically very helpful to hunt memory leaks and some other memory bugs. Be also aware of ASLR. Take care to understand the virtual address space of your process. On Linux, read proc(5) then try cat /proc/$$/maps and cat /proc/self/maps in some terminal for a useful insight.
Ultimately you should consult the vendor's manual and see if their functions do the cleaning themselves or you need to call another function to do the cleaning etc. In general when talking about raw pointers in C++ you should explicitly release the allocated memory where appropriate. Whether it comes from a function or a new / new[] operator does not make a difference. To avoid this new / delete combination you can utilize smart pointers and the RAII technique.
What parts of standard C++ will call malloc/free rather than new/delete?
This MSDN article lists several cases where malloc/free will be called rather than new/delete:
http://msdn.microsoft.com/en-us/library/6ewkz86d.aspx
I'd like to know if this list is (in increasing order of goodness and decreasing order of likelihood):
True for other common implementations
Exhaustive
Guaranteed by some part of the C++ standard
The context is that I'd like to replace global new/delete and am wondering what allocations I'd miss if I did.
I'd like to know if this list is (in increasing order of goodness and decreasing order of likelihood):
1. True for other common implementations
2. Exhaustive
3. Guaranteed by some part of the C++ standard
I'd say you cannot really tell from that list (I suppose the one given in the Remarks section) what other C++ implementations than MS will use.
The C++ implementation is free to use any of the OS provided system calls arbitrarily. So the answer for all 3 of your questions is: No.
As for use of malloc() vs new() in implementations of the C++ specific part of the compiler ABI:
I think you can suppose that C++ specific implementations will use new() or placement new for any allocator implementations.
If those listed methods use new() (most unlikely) or malloc() internally to allocate memory doesn't matter for a user of the C++ standard library implementations.
NOTE:
If you're asking because you plan to override new(), or to use placement new, to provide your own memory allocation mechanism for all memory allocation in a program's context: that's not the way to go!
You'll have to provide your own versions of malloc(), free() et al. then. E.g. when using GCC in conjunction with newlib, there are appropriate stubs you can use for this.
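For what it's worth, here is a minimal sketch of what replacing the global operator new does and does not capture. g_new_calls and new_calls_during are hypothetical names. Only the plain throwing form is replaced; the default array and nothrow forms forward to it, but the aligned forms added in C++17 are not covered by this sketch.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical counter, not part of any library.
static std::size_t g_new_calls = 0;

// Replacement for the global throwing operator new: counts the call and
// forwards to malloc.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (size == 0) size = 1;
    if (void* p = std::malloc(size)) return p;
    throw std::bad_alloc{};
}

// Matching replacements for operator delete (plain and sized forms).
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Returns how many times the replaced operator new ran while fn executed.
template <typename Fn>
std::size_t new_calls_during(Fn fn) {
    const std::size_t before = g_new_calls;
    fn();
    return g_new_calls - before;
}
```

A new expression shows up in the counter, whereas a direct call to malloc does not, which is exactly the kind of allocation the question is worried about missing.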
A new is basically a wrapped malloc. The compiler is allowed to use standard library functions at will; for example, if you try to implement your own memcpy you'll get some weird recursion. If the compiler sees you copying more than a certain amount (say a dumb bit-for-bit copy constructor) it will use memcpy.
So yes, new is sort of a lie, new means "allocate some memory and construct something there and let me write it as one thing", if you allocate an array of floats say, they are uninitialised, malloc will probably be directly used.
Notice I say probably, I'm not sure if they're set to zero these days :P
Anyway, all compiler optimisations ('cept copy elision and other return-value-optimisation stuff - BUT THIS IS THE ONLY EXCEPTION) are invisible to you, that is the point. The program cannot tell it was optimised, you'd have to be timing it and stuff. For example:
(x*10)/2
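This will not be optimised if x is unsigned and the compiler has no idea about its range, because x*10 could wrap around where x*5 would not, so optimising it would change the result. (For a signed x, overflow is undefined behaviour, so the compiler is allowed to simplify anyway.)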
if(x>0 && x<10) {
(x*10)/2
}
will become x*5 because the compiler, being really smart (much more than this) sees "there's no way x*10 can overflow, so x*5 is safe."
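A worked example of why the rewrite is only valid when the range is known, using unsigned 32-bit arithmetic where overflow wraps around. The two function names are made up for this sketch.

```cpp
#include <cstdint>

// With 32-bit unsigned arithmetic, x * 10 wraps modulo 2^32 before the division.
std::uint32_t times_ten_halved(std::uint32_t x) {
    return static_cast<std::uint32_t>(x * 10u) / 2u;
}

// The "simplified" form never forms the larger intermediate product.
std::uint32_t times_five(std::uint32_t x) {
    return static_cast<std::uint32_t>(x * 5u);
}
```

For x = 7 both return 35. For x = 2^29 = 536870912, x*10 wraps to 1073741824, so the first function returns 536870912, while the second returns 2684354560. The two expressions are not interchangeable for arbitrary x.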
If you have a global new/delete that you defined, the compiler cannot optimise, because it cannot know that doing so will have no effects. If you define your own, everything it "simplified" to malloc/free will go away.
NOTE:
I've deliberately ignored the malloc and type-safety stuff. It's not relevant.
The compiler assumes that malloc, free, memcpy and so forth are all super-optimised and will use them ONLY WHERE SAFE - as described above. There's a GCC thread on the mailing list somewhere where I learned of the memcpy thing.
calloc and malloc are much, much lower level than new and delete. Firstly, malloc and calloc are not type-safe, because you cast the result to whatever type you want, and access to the data in that memory is unchecked. (You can end up writing to memory that isn't yours.) If you are doing some real low-level programming you will have to use malloc and calloc. If you are a regular programmer, just use new and delete; they are much easier. Why do you need the precise implementation? (I have to say it is implementation-dependent, because there are many different ones.)
I learned C# and now I'm learning C++. The whole point of releasing a memory is new for me, and I want to know when I need to worry about memory releasing and when I don't.
From what I understand, the only case where I have to worry about releasing memory is when I have used the new operator, so I should release the memory by using delete.
But in these cases there is no need to release the memory:
Class variables (Members), or static variables.
Local variables in function.
STL family (string, list, vector, etc.).
Is this true?
And are there other cases where I have to worry about memory releasing?
You basically got it right: You need to balance new with delete, new[] with delete[], and malloc with free.
Well-written C++ will contain almost none of those, since you leave the responsibility for dynamic memory and lifetime management to suitable container or manager classes, most notably std::vector and std::unique_ptr.
As a general rule of thumb I tend to abide by the following:
If I code a new/new[] I immediately code the corresponding delete/delete[]
Likewise any malloc/calloc is immediately followed by the relevant free
This avoids many nasty situations where you can generate a memory leak.
If you are new to C++, I would not get used to malloc and its many variants. It requires a lot of scaffolding to remain type-safe, which, unless truly necessary, can be counted as a bad thing. However, as mentioned, there are times it is necessary: for example, when you have to use C-based libraries/APIs.
In the main, stay well clear of them and your life will be much easier.
Note: I mention the points above, as having gone from C to C++ I have had to face up to a lot of old tried and tested techniques from C which cause problems in C++.
I saw some post about implement GC in C and some people said it's impossible to do it because C is weakly typed. I want to know how to implement GC in C++.
I want some general idea about how to do it. Thank you very much!
This is a Bloomberg interview question my friend told me. He did badly at that time. We want to know your ideas about this.
Garbage collection in C and C++ are both difficult topics for a few reasons:
Pointers can be typecast to integers and vice-versa. This means that I could have a block of memory that is reachable only by taking an integer, typecasting it to a pointer, then dereferencing it. A garbage collector has to be careful not to think a block is unreachable when indeed it still can be reached.
Pointers are not opaque. Many garbage collectors, like stop-and-copy collectors, like to move blocks of memory around or compact them to save space. Since you can explicitly look at pointer values in C and C++, this can be difficult to implement correctly. You would have to be sure that if someone was doing something tricky with typecasting to integers that you correctly updated the integer if you moved a block of memory around.
Memory management can be done explicitly. Any garbage collector will need to take into account that the user is able to explicitly free blocks of memory at any time.
In C++, there is a separation between allocation/deallocation and object construction/destruction. A block of memory can be allocated with sufficient space to hold an object without any object actually being constructed there. A good garbage collector would need to know, when it reclaims memory, whether or not to call the destructor for any objects that might be allocated there. This is especially true for the standard library containers, which often make use of std::allocator to use this trick for efficiency reasons.
Memory can be allocated from different areas. C and C++ can get memory either from the built-in freestore (malloc/free or new/delete), or from the OS via mmap or other system calls, and, in the case of C++, from get_temporary_buffer (released with return_temporary_buffer). The programs might also get memory from some third-party library. A good garbage collector needs to be able to track references to memory in these other pools and (possibly) would have to be responsible for cleaning them up.
Pointers can point into the middle of objects or arrays. In many garbage-collected languages like Java, object references always point to the start of the object. In C and C++ pointers can point into the middle of arrays, and in C++ into the middle of objects (if multiple inheritance is used). This can greatly complicate the logic for detecting what's still reachable.
So, in short, it's extremely hard to build a garbage collector for C or C++. Most libraries that do garbage collection in C and C++ are extremely conservative in their approach and are technically unsound - they assume that you won't, for example, take a pointer, cast it to an integer, write it to disk, and then load it back in at some later time. They also assume that any value in memory that's the size of a pointer could possibly be a pointer, and so sometimes refuse to free unreachable memory because there's a nonzero chance that there's a pointer to it.
As others have pointed out, the Boehm GC does do garbage collection for C and C++, but subject to the aforementioned restrictions.
Interestingly, C++11 includes some new library functions that allow the programmer to mark regions of memory as reachable and unreachable in anticipation of future garbage collection efforts. It may be possible in the future to build a really good C++11 garbage collector with this sort of information. In the meantime though, you'll need to be extremely careful not to break any of the above rules.
Look into the Boehm Garbage Collector.
C isn't C++, but both have the same "weakly typed" issues. It's not the implicit typecasts that cause an issue, though, but the tendency towards "punning" (subverting the type system), especially in data structure libraries.
There are garbage collectors out there for C and/or C++. The Boehm conservative collector is probably the best known. It's conservative in that, if it sees a bit pattern that looks like a pointer to some object, it doesn't collect that object. That value might be some other type of value completely, so the object could be collected, but "conservative" means playing safe.
Even a conservative collector can be fooled, though, if you use calculated pointers. There's a data structure, for example, where every list node has a field giving the difference between the next-node and previous-node addresses. The idea is to give double-linked list behaviour with a single link per node, at the expense of more complex iterators. Since there's no explicit pointer anywhere to most of the nodes, they may be wrongly collected.
Of course this is a very exceptional special case.
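Here is a rough sketch of the difference-linked list described above, just to show that no node stores an ordinary pointer to its neighbours. Node, addr, walk and demo are hypothetical names, and the nodes live on the stack to keep the example short. Converting integers back to pointers is implementation-defined, although it behaves as expected on mainstream platforms.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical node: link stores (address of next) - (address of previous)
// as an integer, so no ordinary pointer to a neighbour is kept in the node.
struct Node {
    int value;
    std::uintptr_t link;
};

std::uintptr_t addr(const Node* n) { return reinterpret_cast<std::uintptr_t>(n); }

// Walks the list from head, recovering each next address as link + previous.
std::vector<int> walk(const Node* head) {
    std::vector<int> out;
    std::uintptr_t prev = 0;
    const Node* cur = head;
    while (cur != nullptr) {
        out.push_back(cur->value);
        const std::uintptr_t next = cur->link + prev;  // unsigned wrap-around is intended
        prev = addr(cur);
        cur = reinterpret_cast<const Node*>(next);
    }
    return out;
}

// Builds a three-node list a <-> b <-> c and returns the values seen by walk().
std::vector<int> demo() {
    Node a{1, 0}, b{2, 0}, c{3, 0};
    a.link = addr(&b) - 0;         // no previous node
    b.link = addr(&c) - addr(&a);
    c.link = 0 - addr(&b);         // no next node
    return walk(&a);
}
```

A collector scanning these nodes for pointer-like bit patterns would find only the differences, which is why heap-allocated nodes of this kind may be wrongly collected.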
More important - you can either have reliable destructors or garbage collection, not both. When a garbage cycle is collected, the collector cannot decide which destructor to call first.
Since the RAII pattern is pervasive in C++, and that relies on destructors, there is IMO a conflict. There may be valid exceptions, but my view is that if you want garbage collection, you should use a language that's designed from the ground up for garbage collection (Java, C#, ...).
You could either use smart pointers or create your own container object which will track references and handle memory allocation etc. Smart pointers would probably be preferable. Often times you can avoid dynamic heap allocation altogether.
For example:
char* pCharArray = new char[128];
// do some stuff with characters
delete [] pCharArray;
The danger with the above is that if anything throws between the new and the delete, your delete will not be executed. Something like the above could easily be replaced with safer "garbage collected" code:
std::vector<char> charArray(128);
// do some stuff with characters
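A small sketch of the exception case: Tracked and its live counter are hypothetical names that count how many elements exist. The vector's destructor runs during stack unwinding, so nothing is left behind even though no cleanup code was written.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical element type; live counts currently existing instances.
struct Tracked {
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

// Throws while the vector is alive; stack unwinding destroys the vector,
// which destroys its elements and frees its buffer.
void fill_then_throw() {
    std::vector<Tracked> items(3);
    throw std::runtime_error("something went wrong");
}

// Returns the number of Tracked objects still alive after the exception.
int live_after_exception() {
    try {
        fill_then_throw();
    } catch (const std::runtime_error&) {
        // nothing to clean up by hand
    }
    return Tracked::live;
}
```

With a raw new[] in place of the vector, the three elements would still be alive after the catch block.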
Bloomberg has notoriously irrelevant interview questions from a practical coding standpoint. Like most interviewers, though, they are more concerned with how you think and with your communication skills than with the actual solution.
You can read about the shared_ptr struct.
It implements a simple reference-counting garbage collector.
If you want a real garbage collector, you can overload the new operator.
Create a struct similar to shared_ptr, call it Object.
This will wrap the new object created. Now with overloading its operators, you can control the GC.
All you need to do now, is just implement one of the many GC algorithms
The claim you saw is false; the Boehm collector supports C and C++. I suggest reading the Boehm collector's documentation (particularly this page) for a good overview of how one might write a garbage collector in C or C++.
Duplicate of: In what cases do I use malloc vs new?
Just re-reading this question:
What is the difference between "new" and "malloc" and "calloc" in C++?
I checked the answers but nobody answered the question:
When would I use malloc instead of new?
There are a couple of reasons (I can think of two).
Let the best float to the top.
A couple that spring to mind:
When you need code to be portable between C++ and C.
When you are allocating memory in a library that may be called from C, and the C code has to free the allocation.
From the Stroustrup FAQ on new/malloc I posted on that thread:
Whenever you use malloc() you must consider initialization and convertion of the return pointer to a proper type. You will also have to consider if you got the number of bytes right for your use. There is no performance difference between malloc() and new when you take initialization into account.
This should answer your question.
The best reason I can think of to use malloc in C++ is when interacting with a pure C API. Some C APIs I've worked with take ownership of the memory of certain parameters. As such they are responsible for freeing the memory and hence the memory must be free-able via free. Malloc will work for this purpose but not necessarily new.
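The opposite direction looks similar: a C++ library function that hands memory to C callers. In the sketch below, make_greeting is a hypothetical function invented for the example. Because the buffer comes from malloc, a C caller can release it with free.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical function exported to C callers. The returned buffer comes from
// malloc, so the caller can (and must) release it with free().
extern "C" char* make_greeting(const char* name) {
    const char prefix[] = "hello, ";
    const std::size_t len = sizeof(prefix) - 1 + std::strlen(name) + 1;
    char* out = static_cast<char*>(std::malloc(len));
    if (out == nullptr) return nullptr;
    std::strcpy(out, prefix);  // fits: len includes prefix, name and terminator
    std::strcat(out, name);
    return out;
}
```

Had the buffer been created with new char[len], calling free on it from C would be undefined behaviour.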
In C++, just about never. new is usually a wrapper around malloc that calls constructors (if applicable.)
However, at least with Visual C++ 2005 or better, using malloc can actually result in security vulnerabilities over new.
Consider this code:
MyStruct* p1 = new MyStruct[count];
MyStruct* p2 = (MyStruct*)malloc(count * sizeof(MyStruct));
They look equivalent. However, the codegen for the first actually checks for an integer overflow in count * sizeof(MyStruct). In the malloc version, if count comes from an untrusted source, it can cause an integer overflow resulting in a small amount of memory being allocated, but then when you use count you overrun the buffer.
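If malloc has to be used, the overflow check can be done by hand. A minimal sketch follows; alloc_array is a hypothetical helper name. (calloc(count, size) is expected to perform a similar check on current implementations.)

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical helper: allocates room for count elements of elem_size bytes,
// or returns nullptr if count * elem_size would overflow std::size_t.
void* alloc_array(std::size_t count, std::size_t elem_size) {
    if (elem_size != 0 && count > SIZE_MAX / elem_size) {
        return nullptr;  // the multiplication would wrap to a small number
    }
    return std::malloc(count * elem_size);
}
```

The caller still has to check the result for nullptr and release it with free.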
Everybody has mentioned (using slightly different words) when using a C library that is going to use free() and there are a lot of those around.
The other situation I see is:
When writing your own memory management (because for some reason that you have discovered through modeling the default is not good enough). You could allocate a memory block with malloc and then initialize the objects within the pool using placement new.
One reason is that in C++, you can overload the new operator.
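A bare-bones sketch of that idea, with allocation and construction as separate steps. Particle, its live counter and pool_demo are hypothetical names, and a real pool would also need a free list and alignment handling for over-aligned types.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type used to show when construction and destruction happen.
struct Particle {
    static int live;
    int id;
    explicit Particle(int i) : id(i) { ++live; }
    ~Particle() { --live; }
};
int Particle::live = 0;

// Allocates raw storage for n Particles with malloc, constructs them in place
// with placement new, destroys them explicitly, then frees the storage.
// Returns the number of live Particles observed while the pool was in use.
int pool_demo(int n) {
    void* raw = std::malloc(sizeof(Particle) * static_cast<std::size_t>(n));
    if (raw == nullptr) return -1;
    Particle* pool = static_cast<Particle*>(raw);
    for (int i = 0; i < n; ++i) new (pool + i) Particle(i);  // construct only
    const int seen = Particle::live;
    for (int i = n - 1; i >= 0; --i) pool[i].~Particle();    // destroy only
    std::free(raw);                                          // release storage
    return seen;
}
```

Objects created with placement new must be destroyed by an explicit destructor call, never with delete, because delete would also try to release the storage.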
If you wanted to be sure to use the system library memory allocation in your code, you could use malloc.
A C++ programmer should rarely if ever need to call malloc. The only reason to do so that I can think of would be a poorly constructed API which expected you to pass in malloc'd memory because it would be doing the free. In your own code, new should always be the equal of malloc.
If the memory is to be released by free() (in your or someone else's code), it's pretty darn required to use malloc.
Otherwise I'm not sure. One contrived case is when you don't want destructor(s) to be run on exit, but in that case you should probably have objects that have a no-op dtor anyway.
You can use malloc when you don't want to have to worry about catching exceptions (or use a non-throwing version of new).
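For reference, the non-throwing form of new looks like this. try_alloc_ints is a hypothetical helper name. Like malloc, it reports failure with a null pointer instead of throwing std::bad_alloc.

```cpp
#include <cstddef>
#include <new>

// Returns a buffer of n ints, or nullptr on failure, without throwing
// std::bad_alloc. The caller releases it with delete[].
int* try_alloc_ints(std::size_t n) {
    return new (std::nothrow) int[n];
}
```

The result must be checked before use, exactly as with malloc, but it is released with delete[] rather than free.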