Shadowing functions of C stdlib/stdio - c++

I am writing a game and for now i was able to implement a filesystem via sqlite with a class and its methods. To make life more easy i have planned to write some functions like fopen,fclose,fread,rename, etc. to be able to shadow the base functions and to direct my calls to my filesystem rather than to the original one. For the first three function everything worked fine for me with these prototypes:
File *fopen(String _Filename, String _Mode); // i have my own optimized File struct
void fclose(File *_File);
size_t fread(String *_DstBuf, size_t _ElementSize, size_t _Count, File *_File);
This worked fine as i am either returning another struct or the parameters except a File* and not a FILE*, however the rename function seems to be a bit trickier!
int rename(String _OldFilename, String _NewFilename);
This is nearly the same prototype. except that i use std::string (typedef'ed String) than const char*! Any idea how i could convince my compiler either to use my function or to ignore the stdio-one?

And what is the reason that you cannot simply use your own functions by any other name?
If the whole conflict is with overload resolution, you should simply just shadow the actual prototypes; You can make them forwards to your own functions.
However, I recommend against the general approach here: even with that 'fix' in place you will at the very best have include ordering issues, and possibly even duplicate link symbols.
If your functions don't do the same, make them use another name. Since you are using c++, you could do this vile trick (otherwise ill-advised) in MyFsFunctions.h:
namespace MyFsFunctions
{
// prototypes for fopen, fclose, fwrite, fread etc
}
using namespace MyFsFunctions;
// or:
using MyFsFunctions::fopen;
using MyFsFunctions::fclose;
using MyFsFunctions::fread;
using MyFsFunctions::fwrite; // etc...
I'm pretty sure you will still want (need) to shadow the exact function prototypes (or the compiler may still complain about ambiguous identifiers references).
Other hints:
use a fuse file system driver (on Linux/UNIX/MacOS; might be overkill, but implementing it seems a lot more robust and may even simpler than what you do here).
there is always C macros (-10 points for evil)
gnu linker has options that let's you 'replace' link symbols - mainly for debugging purposes, but you can leverage those here

How about implementing a rename with the standard signature that all it will do would be calling your Stringed version?
Doesn't sound complicated to me. Something like this:
int rename(const char *charOld, const char *charNew)
{
std::string stdOld(charOld);
std::string stdNew(charNew);
return rename(stdOld, stdNew);
}

Related

How can I find all places a given member function or ctor is called in g++ code?

I am trying to find all places in a large and old code base where certain constructors or functions are called. Specifically, these are certain constructors and member functions in the std::string class (that is, basic_string<char>). For example, suppose there is a line of code:
std::string foo(fiddle->faddle(k, 9).snark);
In this example, it is not obvious looking at this that snark may be a char *, which is what I'm interested in.
Attempts To Solve This So Far
I've looked into some of the dump features of gcc, and generated some of them, but I haven't been able to find any that tell me that the given line of code will generate a call to the string constructor taking a const char *. I've also compiled some code with -s to save the generated equivalent assembly code. But this suffers from two things: the function names are "mangled," so it's impossible to know what is being called in C++ terms; and there are no line numbers of any sort, so even finding the equivalent place in the source file would be tough.
Motivation and Background
In my project, we're porting a large, old code base from HP-UX (and their aCC C++ compiler) to RedHat Linux and gcc/g++ v.4.8.5. The HP tool chain allowed one to initialize a string with a NULL pointer, treating it as an empty string. The Gnu tools' generated code fails with some flavor of a null dereference error. So we need to find all of the potential cases of this, and remedy them. (For example, by adding code to check for NULL and using a pointer to a "" string instead.)
So if anyone out there has had to deal with the base problem and can offer other suggestions, those, too, would be welcomed.
Have you considered using static analysis?
Clang has one called clang analyzer that is extensible.
You can write a custom plugin that checks for this particular behavior by implementing a clang ast visitor that looks for string variable declarations and checks for setting it to null.
There is a manual for that here.
See also: https://github.com/facebook/facebook-clang-plugins/blob/master/analyzer/DanglingDelegateFactFinder.cpp
First I'd create a header like this:
#include <string>
class dbg_string : public std::string {
public:
using std::string::string;
dbg_string(const char*) = delete;
};
#define string dbg_string
Then modify your makefile and add "-include dbg_string.h" to cflags to force include on each source file without modification.
You could also check how is NULL defined on your platform and add specific overload for it (eg. dbg_string(int)).
You can try CppDepend and its CQLinq a powerful code query language to detect where some contructors/methods/fields/types are used.
from m in Methods where m.IsUsing ("CClassView.CClassView()") select new { m, m.NbLinesOfCode }

how can I parameterize select function in scandir

The scandir() function scans the directory dir, calling
select() on each directory entry as "int(*filter)(const struct dirent *)"
How can I pass pattern value as parameter to fnmatch(const char *pattern, const char *string, int flags) function used in filter ?
Here my sample code
int my_selectgrf(const struct dirent *namelist)
{
int r = 0;
char my_pattern[] = "*.grf";
r = fnmatch(my_pattern, namelist->d_name, FNM_PERIOD);
return (r==0)?1:0;
}
scandir("/pub/data/grf", &namelist, my_selectgrf, alphasort);
my goal is to be able to use my_pattern as input parameter.
The short answer: You can't. This is an atrociously bad API, and it's outright shameful that something like this was added to POSIX as recently as 2008 (based on a bad design in glibc). This kind of API without a way to parameterize it or pass it a context should have been abolished 20+ years ago.
With that said, there are some workarounds:
Approach 1: Use a global variable, and if your code needs to be thread-safe, ensure that only one thread can be using scandir with the given scan function at a time, by locking. This of course serializes usage, which is probably not acceptable if you actually want to be calling the function from multiple threads.
Approach 2: Use thread-local storage, either the GCC __thread keyword (or the C11 _Thread_local keyword, which GCC sadly still does not accept) or POSIX pthread_setspecific and family. This is fairly clean, but unfortunately it may not be correct; if the implementation of scandir internally used multiple threads, the parameter could fail to be available in some calls back to the scan function. At present, I don't believe there are multi-threaded implementations of scandir.
Now, the better solution:
Ditch scandir and write your own function to do the same thing, with the proper API. It's only a few lines anyway.

How to Bypass a Standard C++ Function While Maintaining Its Functionality

I am looking for a way to be able to redefine a set of POSIX functions but then end the redefinition with a call to the original function. The idea is that I am trying to create a layer that can restrict what OS API's can be called depending on which "profile" is active. This "profile" determines what set of functions are allowed and any not specified should not be used.
For example, if in one profile I am not allowed to use strcpy, I would like to be able to either cause a compile time error (via static_assert) or print something to the screen saying "strcpy is not allowed in this profile" such as below:
MY_string.h
#include <string.h>
char *strcpy(char *restrict s1, const char *restrict s2)
{
#if defined(PROFILE_PASS_THROUGH)
printf("strcpy is not allowed in this profile\n");
return strcpy(s1, s2);
#elif defined(PROFILE_ERROR)
static_assesrt(0, "strcpy is not allowed in this profile\n");
return 0;
#else
return strcpy(s1, s2);
#endif
}
So that way within main.cpp I can use MY_string.h
#define PROFILE_PASS_THROUGH
#include "MY_string.h"
int main()
{
char temp1[10];
char temp2[10];
sprintf(temp2, "Testing");
if (0 = strcpy(temp1, temp2))
{
printf("temp1 is %s\n", temp1);
}
return 0;
}
Now I realize that the code I have written above will not compile properly due to the redefinition of strcpy, but is there a way to allow this sort of functionality without playing around with macros or creating my own standard c and c++ libraries?
You can write a preprocessor that changes calls to the standard routine to calls to your own routine. Such a preprocessor might be complicated, depending whether you need to recognize the full C++ grammar to distinguish calls using name spaces and so on or you can get away with more casual recognition of the calls.
You can link with your own library, producing a relocatable object module with resolved names stripped. Your library would contain routines with the standard names, such as strcpy, that execute whatever code you desire and call other names, such as Mystrcpy. The object module produced by this is then linked with a second library and with the standard library. The second library contains routines with those names, such as Mystrcpy, that call the original library names strcpy. The details for doing this are of course dependent on your linker. The goal is to have a chain like this: Original code calls strcpy. This is resolved to the version of strcpy in the first library. That version calls Mystrcpy. Mystrcpy calls the standard library strcpy.
You can compile to assembly and edit the names in the assembly so that your routines are called instead of the standard library routines.
On some systems, you can use dlsym and other functions defined in <dlfcn.h> to load the dynamic library that contains the standard implementations and to call them via pointers returned by dlsym instead of by the usual names in source code.
The GCC linker has a --wrap switch that resolves calls to foo to your routine __wrap_foo and resolves calls to __real_foo (which you would use in your implementation) to the real foo.
See also Intercepting Arbitrary Functions on Windows, UNIX, and Macintosh OS X Platforms.
No, cannot be done in C++. What you want is more akin to a LISP (or derivative) language, where you can grab the slot for an existing function and 'override it in place', potentially punting back to the original implementation.
Typical way of doing is on Unix is via LD_PRELOAD, example (Unix) below proxies a function call, malloc in particular (full example):
/**
* malloc() direct call
*/
inline void * libc_malloc(size_t size)
{
typedef void* (*malloc_func_t)(size_t);
static malloc_func_t malloc_func = (malloc_func_t) dlsym(RTLD_NEXT, "malloc");
return malloc_func(size);
}
In your MY_String.h:
... blah blah
using mynamespace::strcpy;
#endif // header guard or maybe not there if using pragma
then all strcpys that are not prefixed with std:: will use yours. If you REALLY want to ban them, grep and take a shotgun with you when you find the person who used it.
If using some recent GCC (e.g. version 4.7 or newer) you could also write a GCC plugin or a GCC extension in MELT to replace every call to strcpy to your own mystrcpy. This probably will take you some work (perhaps days, not hours) but has the enormous advantage to work inside the compiler, on the GCC compiler's internal representations (Gimple). So it will be done even after inlining, etc. And since you extend the compiler, you can tailor its behavior to what you want.
MELT is a domain specific language to extend GCC. It is designed for such tasks.
You cannot avoid these functions to be called.
A C++ program can do anything it wants, it could have some code that loads the strcpy symbol from libc and runs it. If a malicious developer want to call that function, you have no way to avoid it. To do that you'd need to run the C++ code in some special environment (in a sandbox, or virtual machine), but I'm afraid such technology is not available.
If you trust the developers, and you're just looking for a way to remind them not to call certain functions, then there could be some solution.
One solution could be avoiding to #include libc headers (like cstring), and only include your own header files where you only declared the desired functions.
Another solution could be that of looking to the compiled executable in order to find out what functions are called, or to LD_PRELOAD a library that redefines (and thus overrides) standard functions to make them print a warning at runtime.
Here is how you would you change MY_string.h
#include <cstring>
namespace my_functions{
char *strcpy(char *s1, const char *s2)
{
#if defined(PROFILE_PASS_THROUGH)
printf("strcpy is not allowed in this profile\n");
return std::strcpy(s1, s2);
#elif defined(PROFILE_ERROR)
static_assert(0, "strcpy is not allowed in this profile\n");
return 0;
#else
return std::strcpy(s1, s2);
#endif
}
}
using namespace my_functions;
For this to work you cannot include or have using namespace std;

wrap a c++ library in c? (don't "extern c")

is it possible to wrap a c++ library into c?
how could i do this?
are there any existing tools?
(need to get access to a existing c++ library but only with C)
You can write object-oriented code in C, so if it's an object-oriented C++ library, it's possible to wrap it in a C interface. However, doing so can be very tedious, especially if you need to support inheritance, virtual functions and such stuff.
If the C++ library employs Generic Programming (templates), it might get really hairy (you'd need to provide all needed instances of a template) and quickly approaches the point where it's just not worth doing it.
Assuming it's OO, here's a basic sketch of how you can do OO in C:
C++ class:
class cpp {
public:
cpp(int i);
void f();
};
C interface:
#ifdef __cplusplus
extern "C" {
#endif
typedef void* c_handle;
c_handle c_create(int i)
{
return new cpp(i);
}
void c_f(c_handle hdl)
{
static_cast<cpp*>(hdl)->f();
}
void c_destroy(c_handle hdl)
{
delete static_cast<cpp*>(hdl);
}
#ifdef __cplusplus
}
#endif
Depending on your requirements, you could amend that. For example, if this is going to be a public C interface to a private C++ API, handing out real pointers as handles might make it vulnerable. In that case you would hand out handles that are, essentially, integers, store the pointers in a handle-to-pointer map, and replace the cast by a lookup.
Having functions returning strings and other dynamically sized resources can also become quite elaborate. You would need the C caller provide the buffer, but it can't know the size before-hand. Some APIs (like parts of the WIn32 API) then allow the caller to call such a function with a buffer of the length 0, in which case they return the length of the buffer required. Doing so, however, can make calling through the API horribly inefficient. (If you only know the length of the required buffer after the algorithm executed, it needs to be executed twice.)
One thing I've done in the past is to hand out handles (similar to the handle in the above code) to internally stored strings and provide an API to ask for the required buffer size, retrieve the string providing the buffer, and destroy the handle (which deletes the internally stored string).
That's a real PITA to use, but such is C.
Write a c++ wrapper that does an extern c, compile that with c++, and call your wrapper.
(don't “extern c”)
extern C only helps you to have a names in dll like you see them.
You can use
dumpbin /EXPORTS your.dll
to see what happens with names with extern C or without it.
http://msdn.microsoft.com/en-us/library/c1h23y6c(v=vs.71).aspx
To answer your question... It depends... But it is highly unlikely that you can use it without wrappings. If this C++ library uses just a simple functions and types you can just use it. If this C++ library uses a complex classes structure - probably you will be unable to use it from C without wrapping. It is because the internal of classes may be structured one way or another depending on many conditions (using inference with virtual tables or abstracting. Or in example complex C++ library may have its own object creation mechanisms so you HAVE to use it in the way it is designed or you will get unpredictable behavior).
So, I think, you have to prepare yourself for doing dome wrappings.
And here is a good article about wrapping C++ classes. It the article the Author tells about wrapping C++ classes to C# but he uses C at first step.
http://www.codeproject.com/KB/cs/marshalCPPclass.aspx
If the C++ library is written which can be compiled with C compiler with slight editting (such as changing bool to int, false to 0 and true to 1 etc), then that can be done.
But not all C++ code can be wrapped in C. Template is one feature in C++ that cannot be wrapped, or its nearly impossible.
Wrap it in C++ cpp that calls that dll, and "extern C" in that file you made.

use of char * vs std::string in different environments

I have been using std::string in my code. I was going to make a std::string and pass it by reference. However, someone suggested using a char * instead. Something about std::string is not reliable when porting code. Is that true? I have avoided using char * as I would need to do some memory management for it. Instead I find using the std::string much easier to use.
Basically I have a 10 digit output that I am storing in this string. Atm, I am not sure which would be better to use.
std::string is part of the C++ Standard, and has been since 1998. It is available in all the current C++ compilers. There really is no portability reason not to use it. If you have an API that needs to use a C-style string, you can use the std::string's c_str() member to get one from a string:
std::string s = "foo";
int n = strlen( s.c_str() );
In C++, almost every string should be std::string unless another library requires a cstring, in which case you should still be using an std::string and passing string.c_str(), unless you're using functions that work with buffers.
However, if you're writing a library and exporting functions, it's better to use const char* parameters rather than std::string parameters for portability.
Using a char * you are sure that you will not get portability issues among libraries.
If a library exports a function that uses an std::string, it might have problems communicating with another library that has been linked against a different version of the standard library.
I think that there is nothing to worry about unless you are going to provide some API to 3rd party.
Just use std::string
There's nothing unportable about std::string that isn't also an issue with char *. std::string actually uses a char * internally...
string is better. There is nothing unreliable about it on any platform. If you're worried about passing large classes, you can pass const references of your strings into functions. Makes coding faster and less bug prone.
In addition to the fact thata it's easier, std::string will probably be more efficient. Its small string optimization can keep the 10 digits in the std::string object itself, instead of putting them in another memory block off the heap.