tmpnam warning saying it is dangerous - c++

I get this warning saying that tmpnam is dangerous, but I would prefer to use it, since it can be used as is in Windows as well as Linux. I was wondering why it would be considered dangerous (I'm guessing it's because of the potential for misuse rather than it actually not working properly).

From tmpnam manpage :
The tmpnam() function generates a different string each time it is called, up to TMP_MAX times. If it is called more than TMP_MAX times, the behavior is implementation defined.
Although tmpnam() generates names that are difficult to guess, it is nevertheless possible that between the time that tmpnam() returns a pathname, and the time that the program opens it, another program might create that pathname using open(2), or create it as a symbolic link. This can lead to security holes. To avoid such possibilities, use the open(2) O_EXCL flag to open the pathname. Or better yet, use mkstemp(3) or tmpfile(3).
mkstemp actually creates and opens the file, so you are assured that no other process created it first, whereas tmpnam only returns a name, which may already exist by the time you open it.
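A minimal sketch of the mkstemp(3) route that the man page recommends, for POSIX systems only. The helper name make_temp_file and the "/tmp/example" prefix are made up for illustration; on Windows a different call would be needed.

```cpp
#include <cstdlib>    // mkstemp (POSIX, declared in <stdlib.h>)
#include <string>
#include <unistd.h>   // close, unlink, access

// Hypothetical helper: creates and opens a unique temporary file in one
// atomic step. Returns the open descriptor, or -1 on failure, and stores
// the generated path in path_out.
int make_temp_file(std::string& path_out) {
    std::string tmpl = "/tmp/exampleXXXXXX";  // assumed directory and prefix
    int fd = mkstemp(&tmpl[0]);               // rewrites the X's in place
    if (fd != -1) {
        path_out = tmpl;
    }
    return fd;
}
```

The caller owns the descriptor and the file: close() the former and unlink() the latter when done.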

If you want to use the same symbol on multiple platforms, use a macro to define TMPNAM. As long as you pick more secure functions with the same interface, you'll be able to use it on both. You have conditional compilation somewhere in your code anyway, right?

If you mean the MSVC compiler warning:
These functions are deprecated because more secure versions are available;
see tmpnam_s, _wtmpnam_s.
(http://msdn.microsoft.com/de-de/library/hs3e7355(VS.80).aspx)
Otherwise, read what the man pages say about the drawbacks of this function. It is mostly about a second process creating exactly the same file name as the one your process just generated.

From the tmpnam(3) manpage:
Although tmpnam() generates names that are difficult to guess, it is nevertheless possible that between the time that tmpnam() returns a pathname, and the time that the program opens it, another program might create that pathname using open(2), or create it as a symbolic link. This can lead to security holes. To avoid such possibilities, use the open(2) O_EXCL flag to open the pathname. Or better yet, use mkstemp(3) or tmpfile(3).

The function is dangerous because you are responsible for allocating a buffer big enough to hold the string that tmpnam() writes into it (at least L_tmpnam characters). If you allocate a buffer that is too small, tmpnam() has no way of knowing that and will overrun the buffer, causing havoc. tmpnam_s() (Microsoft's secure version) requires you to pass the length of the buffer, so tmpnam_s() knows when to stop.

Related

How do I get a HANDLE to the containing directory from a file HANDLE?

Given a HANDLE to a file (e.g. C:\\FolderA\\file.txt), I want a function which will return a HANDLE to the containing directory (in the previous example, it would be a HANDLE to C:\\FolderA). For example:
HANDLE hFile = CreateFileA(
"C:\\FolderA\\file.txt",
GENERIC_READ,
FILE_SHARE_READ,
NULL,
OPEN_EXISTING,
FILE_ATTRIBUTE_NORMAL,
NULL);
HANDLE hDirectory = someFunc(hFile);
Possible implementation for someFunc:
HANDLE someFunc(HANDLE h)
{
char *path = getPath(h); // "C:\\FolderA\\file.txt"
char *parent = getParentPath(path); // "C:\\FolderA"
HANDLE hFile = CreateFileA(
parent,
GENERIC_READ,
FILE_SHARE_READ,
NULL,
OPEN_EXISTING,
FILE_FLAG_BACKUP_SEMANTICS, // required to open a handle to a directory
NULL);
free(parent);
free(path);
return hFile;
}
But is there a way to implement someFunc without getParentPath or without making it look at the string and removing everything after the last directory separator (because this is terrible from a performance point of view)?
I don't know what getParentPath is. I assume it's a function that searches for the trailing backslash in the string and uses that to strip off the file specification. You don't have to define such a function yourself; Windows already provides one for you—PathCchRemoveFileSpec. (Note that this assumes the specified path actually contains a file name to remove. If the path doesn't contain a file name, it will remove the trailing directory name. There are other functions you can use to verify whether a path contains a file specification.)
The older version of this function is PathRemoveFileSpec, which is what you would use on downlevel operating systems where the newer, safer function is not available.
Outside of the Windows API, there are other ways of doing the same thing. If you're targeting C++17, there is the filesystem::path class. Boost provides something similar. Or you could write it yourself with the find_last_of member function of the std::string class, if you absolutely have to. (But prefer not to re-invent the wheel. There are lots of edge cases when it comes to path manipulation that you probably won't think of, and that your testing probably won't reveal.)
You express concerns about the performance of this approach. This is nonsense. Stripping some characters from a string is not a slow operation. It wouldn't even be slow if you started searching from the beginning of the string and then, once you found the file specification, made a second copy of the string, again starting from the beginning of the string. It's a simple loop searching through the characters of a reasonable-length string, and then a simple memcpy. There is absolutely no way that this operation could be a performance bottleneck in code that does file I/O.
But the implementation probably isn't even going to be that naïve. You can optimize it by starting the search from the end of the path string, reducing the number of characters that you have to iterate through, and you can avoid any type of memory copy altogether if you're allowed to manipulate the original string. With a C-style string, you just replace the trailing path separator (the one that demarcates the beginning of the file specification) with a NUL character (\0). With a C++-style string, you just call the erase member function.
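As a rough sketch of the in-place approach just described, using std::string: the function name remove_file_spec is made up for illustration, and it deliberately ignores the edge cases mentioned earlier (drive roots, UNC prefixes, trailing separators), which is why a library routine is usually the better choice.

```cpp
#include <string>

// Strips the trailing file specification in place, searching from the end.
// Returns false and leaves the string untouched if no separator is found.
bool remove_file_spec(std::string& path) {
    const std::string::size_type pos = path.find_last_of("\\/");
    if (pos == std::string::npos) {
        return false;
    }
    path.erase(pos);  // drops the separator and everything after it
    return true;
}
```

No allocation or copy takes place; the cost is one backward scan over at most the length of the path.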
In fact, if you really care about performance, this is virtually guaranteed to be faster than making a system call to retrieve the containing folder from a file object. System calls are a lot slower than some compiler-generated, inlinable code to iterate through a string and strip out a sub-string.
Once you have the path to the directory, you can obtain a HANDLE to it by calling the CreateFile function with the FILE_FLAG_BACKUP_SEMANTICS flag. (It is necessary to pass that flag if you want to retrieve a handle to a directory.)
I have measured that this is slow and am looking for a faster way.
Your measurements are wrong. Either you've made the common mistake of benchmarking a debugging build, where the standard library functionality (e.g., std::string) is not optimized, and/or the real performance bottleneck is the file I/O. CreateFile is not a speedy function by any stretch of the imagination. I can almost guarantee that is going to be your hotspot.
Note that if you don't already have the path, it is straightforward to obtain the path from a HANDLE to a file. As was pointed out in the comments, on Windows Vista and later, you simply need to call the GetFinalPathNameByHandle function. More details are available in this article on MSDN, including sample code and an alternative for use on downlevel versions of Windows.
As was mentioned already in the comments to the question, you can optimize this further by allocating a buffer of length MAX_PATH (or perhaps even larger) on the stack. That compiles to a single instruction to adjust the stack pointer, so it won't be a performance bottleneck, either. (Okay, I lied: you actually will need two instructions—one to create space on the stack, and the other to free the allocated space on the stack. Still not a performance problem.) That way, you don't even have to do any dynamic memory allocation.
Note that for maximum robustness, especially on Windows 10, you want to handle the case that a path is longer than MAX_PATH. In such cases, your stack-allocated buffer will be too small, and the function you call to fill it will return an error. Handle that error, and allocate a larger buffer on the free store. That will be slower, but this is an edge case and probably not one that is worth optimizing. The 99% common case will use the stack-allocated buffer.
Furthermore, eryksun points out (in comments to this answer) that, although it is convenient, GetFinalPathNameByHandle requires multiple system calls to map the file object between the NT and DOS namespaces and to normalize the path. I haven't disassembled this function, so I can't confirm his claims, but I have no reason to doubt them. Under normal circumstances, you wouldn't worry about this sort of overhead or possible performance costs, but since this seems to be a big concern for your application, you can use eryksun's alternative suggestion of calling GetFileInformationByHandleEx and requesting the FileNameInfo class. GetFileInformationByHandleEx is a general, multi-purpose function that can retrieve all different sorts of information about a file, including the path. Its implementation is simpler, calling directly down to the native NtQueryInformationFile function. I would have thought GetFinalPathNameByHandle was just a user-mode wrapper providing exactly this service, but eryksun's research suggests it is doing extra work that you might want to avoid if this is truly a performance hot-spot. I have to qualify this slightly by noting that GetFileInformationByHandleEx, in order to retrieve the FileNameInfo, is going to have to create an I/O Request Packet (IRP) and call down to the underlying device driver. That's not a cheap operation, so I'm not sure that the additional overhead of normalizing the path is really going to matter. But in this case, there's no real harm in using the GetFileInformationByHandleEx approach, since it's a documented function.
If you've written the code as described but are still having measurable performance problems, then please post that code for someone to review and help you optimize. The Code Review Stack Exchange site is a great place to get help like that on working code. Feel free to leave me a link to such a question in a comment under this answer so that I don't miss it.
Whatever you do, please stop calling the ANSI versions of the Windows API functions (the ones that end with an A suffix). You want the wide-character (Unicode) versions. These end with a W suffix, and work with strings composed of WCHAR (== wchar_t) characters. Aside from the fact that the ANSI versions have been deprecated for decades now because they do not provide Unicode support (it is not optional for any application written after the year 2000 to support Unicode characters in paths), as much as you care about performance, you should be aware of the fact that all A-suffixed API functions are just stubs that convert the passed-in ANSI string to a Unicode string and then delegate to the W-suffixed version. If the function returns a string, a second conversion also must be done by the A-suffixed version, since all native APIs work with Unicode strings. Performance isn't the real reason why you should avoid calling ANSI functions, but perhaps it's one that you'll find more convincing.
There might be a way to do what you want (map a file object via a HANDLE to its containing directory), but it would require undocumented usage of the NT native API. I don't see anything at all in the documented functions that would allow you to obtain this information. It certainly isn't accessible via the GetFileInformationByHandleEx function. For better or worse, the user-mode file system API is almost entirely path-based. Presumably, it is tracked internally, but even the documented NT native API functions that take a root directory HANDLE (e.g., NtDeleteFile via the OBJECT_ATTRIBUTES structure) allow this field to be NULL, in which case the full path string is used.
As always, if you had provided more details on the bigger picture, we could probably provide a more appropriate solution. This is what the commenters were driving at when they mentioned an XY problem. Yes, people are questioning your motives because that's how we provide the most appropriate help.

Can a function access args passed to main()?

Is there a way I could access program's args outside of main() without storing references to them?
Program arguments are stored within preserved space of the program, so I see no reason for not being able to access them. Maybe there is something like const char** get_program_arguments() and int get_program_arguments_count() but I cannot find it...
My problem comes from the fact that I am rewriting a library that is used now in many programs within the company, and I need to access these programs common arguments without changing them. For example I need program name, but I cannot use ::getenv("_") as they can be executed from various shells. I cannot use GNU extension because this needs to work on Linux, AIX, SunOS using gcc, CC and so on.
Some systems do provide access to the argument list, or at least argv[0]. But it’s common practice for main to mutate argc and argv during option processing, so there is no reliably correct answer as to what a global interface for them should return.
Add to that the general undesirability of global state, and the fact that it harms debugging to have whatever low-level functions attempt to analyze the arguments to a program they might know nothing about, and you end up with don’t do that. It’s not hard to pass arguments (or, better, meaningful flags that result from decoding them) to a library.
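One possible shape for "pass meaningful flags to the library" is sketched below. LibraryOptions, make_library_options and the --verbose flag are all hypothetical names chosen for illustration; the point is only that main() decodes its arguments once and hands the result over explicitly.

```cpp
#include <string>

// Hypothetical settings object that the application fills in for the library.
struct LibraryOptions {
    std::string program_name;
    bool verbose = false;
};

// Intended to be called from main() before any option processing
// mutates argc and argv.
LibraryOptions make_library_options(int argc, char** argv) {
    LibraryOptions opts;
    if (argc > 0 && argv != nullptr && argv[0] != nullptr) {
        opts.program_name = argv[0];
    }
    for (int i = 1; i < argc; ++i) {
        if (std::string(argv[i]) == "--verbose") {  // assumed flag name
            opts.verbose = true;
        }
    }
    return opts;
}
```

The library then takes a LibraryOptions in its initialization call and never needs to look at the raw argument list.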

Creating a new file avoiding race conditions

I need to develop a C++ routine performing this apparently trivial task: create a file only if it does not exist, else do nothing/raise error.
As I need to avoid race conditions, I want to use the "ask forgiveness not permission" principle (i.e. attempting the intended operation and checking if it succeeded, as opposed to checking preconditions in advance), which, to my knowledge, is the only robust and portable method for this purpose [Wikipedia article][an example with getline].
Still, I could not find a way to implement it in my case. The best I could come up with is opening a fstream in app mode (or fopening with "a"), checking the output position with tellp (C++) or ftell (C) and aborting if such position is not zero. This has however two disadvantages, namely that if the file exists it gets locked (although for a short time) and its modification date is altered.
I checked other possible combinations of ios_base::openmode for fstream, as well as the mode strings of fopen but found no option that suited my needs. Further search in the C and C++ standard libraries, as well as Boost Filesystem, proved unfruitful.
Can someone point out a method to perform my task in a robust way (no collateral effects, no race conditions) without relying on OS-specific functions?
My specific problem is in Windows, but portable solutions would be preferred.
EDIT: The answer by BitWhistler completely solves the problem for C programs. Still, I am amazed that no C++ idiomatic solution seems to exist. Either one uses open with the O_EXCL attribute as proposed by Andrew Henle, which is however OS-specific (in Windows the attribute seems to be called _O_EXCL with an additional underscore [MSDN]), or one separately compiles a C11 file and links it from the C++ code. Moreover, the file descriptor obtained cannot be converted to a stream except with nonstandard extensions (e.g. GCC's __gnu_cxx::stdio_filebuf). I hope a future version of C++ will implement the "x" subspecifier and possibly also a corresponding ios:: modifier for file streams.
The new C standard (C2011, which is not part of C++) adds a new standard subspecifier ("x"), that can be appended to any "w" specifier (to form "wx", "wbx", "w+x" or "w+bx"/"wb+x"). This subspecifier forces the function to fail if the file exists, instead of overwriting it.
source: http://www.cplusplus.com/reference/cstdio/fopen/
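A minimal sketch of the "x" subspecifier called through std::fopen from C++. It assumes the C runtime in use implements the C11 flag (glibc does); the helper name create_if_absent is made up for illustration.

```cpp
#include <cstdio>

// Creates the file only if it does not already exist.
// Returns true if this call created it, false otherwise.
bool create_if_absent(const char* path) {
    std::FILE* f = std::fopen(path, "wx");  // "x": fail if the file exists
    if (f == nullptr) {
        return false;  // already exists, or some other error (check errno)
    }
    std::fclose(f);
    return true;
}
```

Because the existence check and the creation happen inside a single open call, there is no window between "test" and "create" for another process to slip into, and an existing file is neither truncated nor touched.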

What's a good, threadsafe, way to pass error strings back from a C shared library

I'm writing a C shared library for internal use (I'll be dlopen()'ing it from a C++ application, if that matters). The shared library loads (amongst other things) some Java code through a JNI module, which means all manner of nightmare error modes can come out of the JVM that I need to handle intelligently in the application. Additionally, this library needs to be re-entrant. Is there an idiom for passing error strings back in this case, or am I stuck mapping errors to integers and using printfs to debug things?
Thanks!
My approach to the problem would be a little different from everyone else's. They're not wrong, it's just that I've had to wrestle with a different aspect of this problem.
A C API needs to provide numeric error codes, so that the code using the API can take sensible measures to recover from errors when appropriate, and pass them along when not. The errno.h codes demonstrate a good categorization of errors; in fact, if you can reuse those codes (or just pass them along, e.g. if all your errors come ultimately from system calls), do so.
Do not copy errno itself. If possible, return error codes directly from functions that can fail. If that is not possible, have a GetLastError() method on your state object. You have a state object, yes?
If you have to invent your own codes (the errno.h codes don't cut it), provide a function analogous to strerror, that converts these codes to human-readable strings.
It may or may not be appropriate to translate these strings. If they're meant to be read only by developers, don't bother. But if you need to show them to the end user, then yeah, you need to translate them.
The untranslated version of these strings should indeed be just string constants, so you have no allocation headaches. However, do not waste time and effort coding your own translation infrastructure. Use GNU gettext.
If your code is layered on top of another piece of code, it is vital that you provide direct access to all the error information and relevant context information that that code produces, and you make it easy for developers against your code to wrap up all that information in an error message for the end user.
For instance, if your library produces error codes of its own devising as a direct consequence of failing system calls, your state object needs methods that return the errno value observed immediately after the system call that failed, the name of the file involved (if any), and ideally also the name of the system call itself. People get this wrong waaay too often -- for instance, SQLite, otherwise a well designed API, does not expose the errno value or the name of the file, which makes it infuriatingly hard to distinguish "the file permissions on the database are wrong" from "you have a bug in your code".
EDIT: Addendum: common mistakes in this area include:
Contorting your API (e.g. with use of out-parameters) so that functions that would naturally return some other value can return an error code.
Not exposing enough detail for callers to be able to produce an error message that allows a knowledgeable human to fix the problem. (This knowledgeable human may not be the end user. It may be that your error messages wind up in server log files or crash reports for developers' eyes only.)
Exposing too many different fine distinctions among errors. If your callers will never plausibly do different things in response to two different error codes, they should be the same code.
Providing more than one success code. This is asking for subtle bugs.
Also, think very carefully about which APIs ought to be allowed to fail. Here are some things that should never fail:
Read-only data accessors, especially those that return scalar quantities, most especially those that return Booleans.
Destructors, in the most general sense. (This is a classic mistake in the UNIX kernel API: close and munmap should not be able to fail. Thankfully, at least _exit can't.)
There is a strong case that you should immediately call abort if malloc fails rather than trying to propagate it to your caller. (This is not true in C++ thanks to exceptions and RAII -- if you are so lucky as to be working on a C++ project that uses both of those properly.)
In closing: for an example of how to do just about everything wrong, look no further than XPCOM.
You return pointers to static const char [] objects. This is always the correct way to handle error strings. If you need them localized, you return pointers to read-only memory-mapped localization strings.
In C, if you don't have internationalization (I18N) or localization (L10N) to worry about, then pointers to constant data is a good way to supply error message strings. However, you often find that the error messages need some supporting information (such as the name of the file that could not be opened), which cannot really be handled by constant data.
With I18N/L10N to worry about, I'd recommend storing the fixed message strings for each language in an appropriately formatted file, and then using mmap() to 'read' the file into memory before you fork any threads. The area so mapped should then be treated as read-only (use PROT_READ in the call to mmap()).
This avoids complicated issues of memory management and avoids memory leaks.
Consider whether to provide a function that can be called to get the latest error. It can have a prototype such as:
int get_error(int errnum, char *buffer, size_t buflen);
I'm assuming that the error number is returned by some other function call; the library function then consults any threadsafe memory it has about the current thread and the last error condition returned to that thread, and formats an appropriate error message (possibly truncated) into the given buffer.
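A rough sketch of how such a get_error could be backed by per-thread storage, assuming a C++11 compiler for thread_local. The set_error helper, the LastError layout and the message format are assumptions made for illustration, not part of any existing library.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Per-thread record of the most recent failure (hypothetical layout).
struct LastError {
    int code = 0;
    std::string detail;
};
static thread_local LastError g_last_error;

// Called by library internals when an operation fails on this thread.
void set_error(int code, const std::string& detail) {
    g_last_error.code = code;
    g_last_error.detail = detail;
}

// Formats a message for errnum into buffer, truncating if necessary.
// Returns 0 on success, or -1 if the buffer is unusable.
int get_error(int errnum, char* buffer, std::size_t buflen) {
    if (buffer == nullptr || buflen == 0) {
        return -1;
    }
    // Only attach the detail if it belongs to the error being asked about.
    const char* detail =
        (g_last_error.code == errnum) ? g_last_error.detail.c_str() : "";
    std::snprintf(buffer, buflen, "error %d: %s", errnum, detail);
    return 0;
}
```

Since each thread has its own LastError, no locking is needed, and the caller supplies the buffer, so the library never hands out memory that someone has to free.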
With C++, you can return (a reference to) a standard string (std::string) from the error reporting mechanism; this means you can format the string to include the file name or other dynamic attributes. The code that collects the information will be responsible for releasing the string, which isn't (shouldn't be) a problem because of the destructors that C++ has. You might still want to use mmap() to load the format strings for the messages.
You do need to be careful about the files you load and, in particular, any strings used as format strings. (Also, if you are dealing with I18N/L10N, you need to worry about whether to use the 'n$' notation to allow for argument reordering; and you have to worry about different rules for different cultures/languages about the order in which the words of a sentence are presented.)
I guess you could use PWideChars, as Windows does. It's thread safe. The calling app creates a PWideChar that the DLL uses to set an error. Then the calling app reads that PWideChar and frees its memory.
R. has a good answer (use static const char []), but if you are going to have various spoken languages, I like to use an Enum to define the error codes. That is better than some #define of a bunch of names to an int value.
Return integers; don't set some global variable (like errno, even if it is potentially TLSed by an implementation). This is akin to the Linux kernel's style of return -ENOENT;.
Have a function similar to strerror that takes such an integer and returns a pointer to a const string. This function can transparently do I18N if needed, too, as gettext-returnable strings also remain constant over the lifetime of the translation database.
If you need to provide non-static error messages, then I recommend returning strings like this: error_code_t function(, char** err_msg). Then provide a function to free the error message: void free_error_message(char* err_msg). This way you hide how the error strings are allocated and freed. This is of course only worth implementing if your error strings are dynamic in nature, meaning that they convey more than just a translation of error codes.
Please overlook my formatting. I'm writing this on a cell phone...

Issue with using std::copy

I am getting a warning when using the std::copy function.
I have a byte array that I declare.
byte *tstArray = new byte[length];
Then I have a couple of other byte arrays that are declared and initialized with some hex values that I would like to use depending on some initial user input.
I have a series of if statements that I use to basically parse out the original input, and based on some string, I choose which byte array to use and in doing so copy the results to the original tstArray.
For example:
if(substr1 == "15")
{
std::cout<<"Using byte array rated 15"<<std::endl;
std::copy(ratedArray15,ratedArray15+length,tstArray);
}
The warning I get is
warning C4996: 'std::copy': Function call with parameters
that may be unsafe
- this call relies on the caller to check that the passed
values are correct.
A possible way to disable this warning is to use -D_SCL_SECURE_NO_WARNINGS, I think. Well, that is what I am researching.
But, I am not sure if this means that my code is really unsafe and I actually needed to do some checking?
C4996 means you're using a function that was marked as __declspec(deprecated). Defining _SCL_SECURE_NO_WARNINGS will probably just #ifdef out the deprecation. You could go read the header file to know for sure.
But the question is why is it deprecated? MSDN doesn't seem to say anything about it on the std::copy() page, but I may be looking at the wrong one. Typically this was done for all "unsafe string manipulation functions" during the great security push of XPSP2. Since you aren't passing the length of your destination buffer to std::copy, if you try to write too much data to it it will happily write past the end of the buffer.
To say whether or not your usage is unsafe would require us to review your entire code. Usually there is a safer version they recommend when they deprecate a function in this manner. You could just copy the strings in some other way. This article seems to go in depth. They seem to imply you should be using a stdext::checked_array_iterator instead of a regular OutputIterator.
Something like:
stdext::checked_array_iterator<byte *> chkd_test_array(tstArray, length);
std::copy(ratedArray15, ratedArray15+length, chkd_test_array);
(If I understand your code right.)
Basically, what this warning tells you is that you have to be absolutely sure that tstArray points to an array that is large enough to hold "length" elements, as std::copy does not check that.
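A portable way to make that check explicit is a small wrapper that refuses to copy when the destination is too small. This is only a sketch: copy_checked is a made-up helper, and unsigned char stands in for the question's byte type, whose definition is not shown.

```cpp
#include <algorithm>
#include <cstddef>

// Copies count bytes from src to dst only if dst has room for them.
// Returns false, copying nothing, when the destination is too small.
bool copy_checked(const unsigned char* src, std::size_t count,
                  unsigned char* dst, std::size_t dst_capacity) {
    if (count > dst_capacity) {
        return false;
    }
    std::copy(src, src + count, dst);
    return true;
}
```

This does not silence C4996 on compilers that emit it for raw pointers, but it does perform the check the warning asks the caller to be responsible for.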
Well, I assume Microsoft's unilateral deprecation of the stdlib also includes passing char* to std::copy. (They've messed with a whole range of functions actually.)
I suppose parts of it have some merit (fopen() touches the global errno, so it's not thread-safe), but other decisions do not seem very rational. (I'd say they took too big a swathe at the whole thing. There should be levels, such as non-thread-safe, non-checkable, etc.)
I'd recommend reading the MS-doc on each function if you want to know the issues about each case though, it's pretty well documented why each function has that warning, and the cause is usually different in each case.
At least it seems that VC++ 2010 RC does not emit that warning at the default warning level.