VC++ wcscpy_s randomly asserts "Buffer is too small" - c++

As you can see from the code below, the argument passed to the function never exceeds the buffer size.
The problem happens randomly, and only in the Debug build.
#include <thread>
#include <sstream>
#include <cwchar>  // wcscpy_s (MSVC CRT)
#define BUF_SZ 32
int main()
{
    wchar_t src[BUF_SZ]{};
    bool running = true;
    std::thread th([&] {
        for (double g = 0; g < 100000; g += .1)
        {
            std::wstringstream ws;
            ws << g;
            wcscpy_s(src, BUF_SZ, ws.str().c_str());
        }
        running = false;
    });
    wchar_t dst[BUF_SZ]{};
    while (running)
        wcscpy_s(dst, src); // asserts "Buffer is too small" at random
    th.join();
    return 0;
}

Thanks to Steve Wishnousky from the Microsoft VC++ team, here is the complete explanation of the problem.
Wcscpy_s does not operate atomically on the buffers and will only work
correctly if the buffers do not change contents during the runtime of
wcscpy_s.
Another thing to note is that in Debug mode, the wcscpy_s function
fills the rest of the buffer with a debug mark (0xFE) to indicate
that the data there can no longer be assumed valid, in order to
detect potential runtime errors.
The error happens differently every time of course, but let's assume
this error happens when src=1269.9 and wcscpy_s(dst, src) is called.
The actual contents of src are: "1 2 6 9 . 9 null 0xFE 0xFE ...".
wcscpy_s copies over the 1269.9, but as it's about to read the null,
the other wcscpy_s has just written a new value to src, so it's now:
"1 2 7 0 null 0xFE 0xFE ...". Instead of reading the null terminator
from the previous src, it reads a 0xFE byte, so it thinks this is a
real character. Since there is no null terminator until we reach the
end of the buffer, the Debug runtime asserts that the buffer was too
small for the input.
In the Release build, the 0xFE debug marks aren't placed in the
buffer, so it will eventually find a null character. You can also
disable the debug marks by calling _CrtSetDebugFillThreshold:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/crtsetdebugfillthreshold?view=vs-2019.
Note that the Debug marks are actually catching a real correctness
problem here, though. This "buffer changed during wcscpy_s" issue
could happen for any value. For example, if src=1269.9, wcscpy_s
could copy over the 126, but then, as it's about to read the 9, src
is updated to 1270, and the value that ends up in dst would be "1260".
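The underlying fix is to stop the two threads from touching src at the same time. Here is a minimal sketch using a std::mutex; the mutex and the atomic flag are illustrative additions, not part of the original program:
#include <atomic>
#include <cwchar>
#include <mutex>
#include <sstream>
#include <thread>
#define BUF_SZ 32
int main()
{
    wchar_t src[BUF_SZ]{};
    std::mutex srcMutex;             // guards every access to src
    std::atomic<bool> running{true}; // also fixes the data race on the flag
    std::thread th([&] {
        for (double g = 0; g < 100000; g += .1)
        {
            std::wstringstream ws;
            ws << g;
            std::lock_guard<std::mutex> lock(srcMutex);
            wcscpy_s(src, BUF_SZ, ws.str().c_str());
        }
        running = false;
    });
    wchar_t dst[BUF_SZ]{};
    while (running)
    {
        std::lock_guard<std::mutex> lock(srcMutex);
        wcscpy_s(dst, src); // src can no longer change mid-copy
    }
    th.join();
    return 0;
}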

Since the string copy needs to store the characters of src plus a terminating null character, the destination must hold at least wcslen(src) + 1 wide characters. Note that the size argument of wcscpy_s counts wide characters, not bytes, so sizeof src (which gives bytes) is not the right value, and claiming more space than dst actually has would defeat the bounds check. With two equally sized arrays you can write:
wcscpy_s(dst, BUF_SZ, src);
(This does not address the root cause, which is the race described above.)

Related

strncat undefined behaviour on particular number

I know the destination is too small, so my code does not work correctly. What I want to know is why, when I make the destination size 35, the characters print in an infinite loop; with other (or bigger) sizes it crashes or works.
I am using Code::Blocks with gcc on Windows 7.
#include <cstring>
#include <iostream>
using namespace std;
int main()
{
    // with another size it works or crashes, but with 35 it prints an
    // endless stream of characters -- why, what is the logic?
    char dest[35] = "This is an example";
    char src[50] = " to show working of strncat() this is not added";
    strncat(dest, src, 29);
    cout << dest;
    return 0;
}
You have 18 characters plus a null byte in the initialized dest array.
When you call strncat(dest, src, 29), you say "it is safe to append 29 extra characters to dest", which is nonsense, as you can only append 16 characters (plus the terminating null byte) without overflowing the array.
You invoked undefined behaviour. That means the program can work, or crash, or go into an infinite loop, and all those behaviours are OK because undefined behaviour doesn't have to behave in any specific way.
Note that strncat(dst, very_long_string, sizeof(dst)) is a boundary error, even if dst[0] == '\0'. You can use strncat(dst, very_long_string, sizeof(dst) - 1) if the destination is empty. (The very long string is any string longer than sizeof(dst) - 1, of course.)
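For a possibly non-empty destination, the safe count is the remaining space minus one byte for the null terminator. A minimal sketch; the safe_append helper is hypothetical, not from the question:
#include <cstring>
#include <iostream>

// append as much of src as fits, always leaving room for the null byte
static void safe_append(char *dest, std::size_t destsize, const char *src)
{
    std::size_t used = std::strlen(dest);  // bytes in use, excluding the null
    if (used + 1 < destsize)               // is there any room left?
        std::strncat(dest, src, destsize - used - 1);
}

int main()
{
    char dest[35] = "This is an example";
    safe_append(dest, sizeof dest, " to show working of strncat()");
    std::cout << dest << '\n';             // truncated to fit, no overflow
    return 0;
}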

Memcpy Readable Range

Small code bits:
#include <cstring>
int main()
{
    char buf[18];
    char buf2[18];
    int newlength = 16;
    memset(buf, '0', 16);
    for (int i = newlength; i < 18; i++)
        buf[i] = 0x00;
    memcpy(buf2, buf, 18);
    return 0;
}
First I want to set a portion of an array to a specific value and then I would like to fill the rest with 0x00. Then I'd like to copy it to another array.
On MS VS2013, I receive a warning that the readable range for buf is between 0 and 15 (Code Analysis for C/C++ Warnings, C6385 Read Overrun). Why? Does memcpy ignore the bytes set to 0x00?
This message from the code analyzer seems to be based on the principle that the buffer's contents are defined solely by the output of memset(). It misses the point that the loop after memset() completes the initialization.
If you double-click on the warning, you get a highlighting of the lines considered for triggering this warning.
But the code you wrote is correct, so you don't have to worry about the result here. The online documentation says "might", not "will":
This warning indicates that the readable extent of the specified buffer might be smaller than the index used to read from it.
Additional remarks:
Even when making what is going on more obvious for the analyzer, it still raises the same spurious warning:
    memset(buf, '0', 16);
    memset(buf + 16, 0x00, 2); // replaces the loop
In this case, the analyzer notices the second memset(), but as it doesn't affect buf from its beginning, it treats it as an input/output buffer operation without taking the additional length into consideration.
Even this kind of over-cautious code raises the warning:
    memset(buf, 0x00, sizeof(buf)); // completely initialize the buffer
    memset(buf, '0', 16);           // overwrite just the beginning
Here, it seems that as soon as a memxxx() operation targets the beginning of the buffer, the length of that operation is considered to be the sole initialized part.
So yes, the warning is annoying, but trust your code. I could only get rid of the warning with some really weird and inefficient coding:
    memset(buf, 0x00, sizeof(buf)); // all 18 bytes get initialized
    memset(buf + 1, '0', 15);       // doesn't start at the beginning of the buffer
    buf[0] = '0';                   // not a memxxx() operation
Unfortunately, the analyzer's configuration doesn't allow disabling just this single rule, only the whole set of security verification rules.
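If the warning has to go away without rewriting correct code, MSVC's code-analysis warnings can usually be suppressed per occurrence with #pragma warning(suppress : ...). A minimal sketch, assuming the analysis run honors the pragma (it applies only to the line that follows it):
#include <cstring>
int main()
{
    char buf[18];
    char buf2[18];
    memset(buf, '0', 16);
    for (int i = 16; i < 18; i++)
        buf[i] = 0x00;
#pragma warning(suppress : 6385) // silence the spurious C6385 on the next line only
    memcpy(buf2, buf, 18);
    return 0;
}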
This seems to be a bug in the compiler / lint tool (depending on who shows you the warning).
You're initializing 16 bytes with the memset() and then accessing 18; the tool appears to think the read runs past the initialized region, even though the loop initializes the remaining two bytes.

C/C++ Valid pointer but getting EXC_BAD_ACCESS and KERN_INVALID_ADDRESS

I've been racking my brain over this for hours and I can't find anything.
Potentially relevant information:
Running on OSX 10.10.1 Yosemite
This same code works perfectly fine on Windows.
Every time I run this, it breaks at the exact same spot.
The application is an OpenGL app that uses glfw3 to create a window.
There are no threads; it's just a single-threaded app, so the pointer is not being overwritten or deallocated.
These two methods are contained in two separate .c files that are compiled as c++ and contained within a built library that I link to. Other methods in the library work just fine.
OPchar* OPstreamReadLine(OPstream* stream) {
    OPchar buffer[500];
    i32 len, i;
    // ALL WORKS FINE
    // check to see if we are at the end of the stream or not
    if (stream->_pointer >= stream->Length) return 0;
    // Prints out the contents of the stream, and the start of the pointer just fine
    OPlog("Buffer %s | Pointer %d", stream->Data, stream->_pointer);
    sscanf((OPchar*)stream->Data + stream->_pointer, "%500[^\n]", buffer);
    len = strlen(buffer);
    stream->_pointer += len + 1;
    // Spits out 'Read Hello of len 5'
    OPlog("Read %s of len %d", buffer, len);
    // ISSUE STARTS HERE
    // OPchar is a typedef of char
    // STEP 1. Make the call
    OPchar* result = OPstringCreateCopy(buffer);
    // STEP 6. The pointer is printed out correctly, it's the same thing
    // ex: Pos: 0xd374b4
    OPlog("Pos: 0x%x", result);
    // STEP 7. This is where it breaks
    // EXC_BAD_ACCESS and KERN_INVALID_ADDRESS
    // What happened?
    // Did returning the pointer from the function break it?
    OPlog("len: %d", strlen(result));
    OPlog("Result %s", result);
    return result;
}
OPchar* OPstringCreateCopy(const OPchar* str) {
    i32 len = strlen(str);
    // STEP 2. Prints out 'Hello 5'
    OPlog("%s %d", str, len);
    // Allocates it (just uses malloc)
    OPchar* result = (OPchar*)OPalloc(sizeof(OPchar) * (len + 1));
    // Copies the previous string into the newly created one
    strcpy(result, str);
    // Ensures that it's null terminated
    // even though strcpy is supposed to do it
    result[len] = '\0';
    // STEP 3. Gives a good pointer value
    // ex: Pos: 0xd374b4
    OPlog("Pos: 0x%x", result);
    // STEP 4. Prints out '5'
    OPlog("len: %d", strlen(result));
    // STEP 5. Prints out 'Hello'
    OPlog("hmmm: %s", result);
    // Just return this same pointer
    return result;
}
I've since replaced these functions with versions that don't use the sscanf stuff which got around the issue, however I'm now hitting the same problem with another returned pointer becoming invalid. This example was simpler to explain, so I thought I'd start there.
Here's a theory, which you can go test. Instead of using %x to print your pointers, use %p. You may be on a 64-bit OS without realizing it. The problem could be that you did not supply a prototype for OPstringCreateCopy, in which case the return value was treated as an int (32 bits) instead of a pointer (64 bits). Since you are only printing out 32 bits of result, it seems like the pointer is valid, but the upper 32 bits may have been lost.
The fix for this is to make sure you always supply prototypes for all your functions. There should be some compiler warnings that you can turn on to assist you with finding uses of unprototyped functions. You might also want to go through your code and check for any other 64-bit problems, such as if you ever cast a pointer to an int.
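A minimal sketch of the failure mode this theory describes, as it can occur in C compiled without prototypes; the file and function names here are hypothetical, not from the question:
/* main.c -- calls make_copy() with no declaration in scope, so a
   pre-C99 compiler silently assumes it returns int (32 bits);
   on a 64-bit OS the upper half of the returned pointer is lost */
#include <stdio.h>
int main(void)
{
    char *p = (char *)make_copy("Hello"); /* possibly truncated pointer */
    printf("%p\n", (void *)p);            /* %p prints all the bits     */
    return 0;
}

/* copy.c -- compiled separately, so the mismatch is never diagnosed;
   the fix is a shared header with the real prototype:
   char *make_copy(const char *s);                                    */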

Reading a file into a string buffer and detecting EOF

I am opening a file and placing its contents into a string buffer to do some lexical analysis on a per-character basis. Doing it this way lets parsing finish faster than a long sequence of fread() calls would, and since the source file will never be larger than a couple of MBs, I can rest assured that the entire contents of the file will always be read.
However, there is some trouble detecting when there is no more data to parse, because ftell() often gives me an integer value higher than the actual number of characters in the file. This wouldn't be a problem with the use of the EOF (-1) macro if the trailing characters were always -1... but that is not always the case...
Here's how I am opening the file, and reading it into the string buffer:
FILE *fp = NULL;
errno_t err = _wfopen_s(&fp, m_sourceFile, L"rb, ccs=UNICODE");
if (fp == NULL || err != 0) return FALSE;
if (fseek(fp, 0, SEEK_END) != 0) {
    fclose(fp);
    fp = NULL;
    return FALSE;
}
LONG fileSize = ftell(fp);
if (fileSize == -1L) {
    fclose(fp);
    fp = NULL;
    return FALSE;
}
rewind(fp);
LPSTR s = new char[fileSize];
RtlZeroMemory(s, sizeof(char) * fileSize);
DWORD dwBytesRead = 0;
if (fread(s, sizeof(char), fileSize, fp) != fileSize) {
    fclose(fp);
    fp = NULL;
    return FALSE;
}
This always appears to work perfectly fine. Following this is a simple loop, which checks the contents of the string buffer one character at a time, like so:
char c = 0;
LONG nPos = 0;
while (c != EOF && nPos <= fileSize)
{
    c = s[nPos];
    // do something with 'c' here...
    nPos++;
}
The trailing bytes of the file are usually a series of ý (-3) and « (-85) characters, and therefore EOF is never detected. Instead, the loop simply continues until nPos ends up with a higher value than fileSize, which is not desirable for proper lexical analysis, because you often end up skipping the final token in a stream that omits a trailing newline character.
In a Basic Latin character set, would it be safe to assume that an EOF char is any character with a negative value? Or perhaps there is just a better way to go about this?
EDIT: I have just tried using feof() in my loop, and all the same, it doesn't detect EOF either.
Assembling comments into an answer...
You leak memory (potentially a lot of memory) when you fail to read.
You haven't allowed for a null terminator at the end of the string read.
There's no point in zeroing the memory when it is all about to be overwritten by the data from the file.
Your test loop is accessing memory out of bounds; nPos == fileSize is one beyond the end of the memory you allocated.
char c = 0;
LONG nPos = 0;
while (c != EOF && nPos <= fileSize)
{
    c = s[nPos];
    // do something with 'c' here...
    nPos++;
}
There are other problems, not previously mentioned, with this. You did ask if it is 'safe to assume that an EOF char is any character with a negative value', to which I responded No. There are several issues here, which affect both C and C++ code. The first is that plain char may be a signed type or an unsigned type. If the type is unsigned, then you can never store a negative value in it (or, more accurately, if you attempt to store a negative integer into an unsigned char, it will be truncated to the least significant 8* bits and will be treated as positive).
In the loop above, one of two problems can occur. If char is a signed type, then there is a character (ÿ, y-umlaut, U+00FF, LATIN SMALL LETTER Y WITH DIAERESIS, 0xFF in the Latin-1 code set) that has the same value as EOF (which is always negative and usually -1). Thus, you might detect EOF prematurely. If char is an unsigned type, then there will never be any character equal to EOF. But the test for EOF on a character string is fundamentally flawed; EOF is a status indicator from I/O operations and not a character.
During I/O operations, you will only detect EOF when you've attempted to read data that isn't there. The fread() won't report EOF; you asked to read what was in the file. If you tried getc(fp) after the fread(), you'd get EOF unless the file had grown since you measured how long it is. Since _wfopen_s() is a non-standard function, it might be affecting how ftell() behaves and the value it reports. (But you later established that wasn't the case.)
Note that functions such as fgetc() or getchar() are defined to return characters as positive integers and EOF as a distinct negative value.
If the end-of-file indicator for the input stream pointed to by stream is not set and a next character is present, the fgetc function obtains that character as an unsigned char converted to an int.
If the end-of-file indicator for the stream is set, or if the stream is at end-of-file, the end-of-file indicator for the stream is set and the fgetc function returns EOF. Otherwise, the fgetc function returns the next character from the input stream pointed to by stream. If a read error occurs, the error indicator for the stream is set and the fgetc function returns EOF.289)
289) An end-of-file and a read error can be distinguished by use of the feof and ferror functions.
This indicates how EOF is separate from any valid character in the context of I/O operations.
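For illustration, here is a minimal sketch of the standard idiom that keeps EOF distinguishable from every character: store the result of fgetc() in an int, not a char (the scan_stream helper is hypothetical, not from the question):
#include <stdio.h>

void scan_stream(FILE *fp)
{
    int c;                     /* int, not char: it must be able to hold EOF */
    while ((c = fgetc(fp)) != EOF)
    {
        /* process 'c' here; it is always in the range 0..UCHAR_MAX */
    }
    /* EOF was returned: feof(fp) and ferror(fp) tell you which case it was */
}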
You comment:
As for any potential memory leakage... At this stage in my project, memory leaks are one of many problems with my code which, as of yet, are of no concern to me. Even if it didn't leak memory, it doesn't even work to begin with, so what's the point? Functionality comes first.
It is easier to head off memory leaks in error paths at the initial coding stage than to go back later and fix them, not least because you may never spot them if you never trigger the error condition. However, the extent to which that matters depends on the intended audience for the program. If it is a one-off for a coding course, you may be fine. If you're the only person who'll use it, you may be fine. But if it will be installed by millions, you'll have problems retrofitting the checks everywhere.
I have swapped _wfopen_s() with fopen() and the result from ftell() is the same. However, after changing the corresponding lines to LPSTR s = new char[fileSize + 1]; and RtlZeroMemory(s, sizeof(char) * (fileSize + 1)); (which also null-terminates it, btw), and adding if(nPos == fileSize) break; to the top of the loop, it now comes out cleanly.
OK. You could use just s[fileSize] = '\0'; to null-terminate the data too; using RtlZeroMemory() achieves the same effect (but would be slower if the file were many megabytes in size). I'm glad the various comments and suggestions helped get you back on track.
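Putting the corrections together, a minimal sketch of the fixed fragment, in the same Windows-flavoured style as the question (the error paths now free the buffer, the buffer gets a real terminator, and the loop is bounded by the size rather than by a character value):
FILE *fp = NULL;
errno_t err = _wfopen_s(&fp, m_sourceFile, L"rb, ccs=UNICODE");
if (fp == NULL || err != 0) return FALSE;
if (fseek(fp, 0, SEEK_END) != 0) {
    fclose(fp);
    return FALSE;
}
LONG fileSize = ftell(fp);
if (fileSize == -1L) {
    fclose(fp);
    return FALSE;
}
rewind(fp);
char *s = new char[fileSize + 1];              // +1 for the terminator
if (fread(s, sizeof(char), fileSize, fp) != (size_t)fileSize) {
    delete[] s;                                // no leak on the error path
    fclose(fp);
    return FALSE;
}
fclose(fp);
s[fileSize] = '\0';                            // cheaper than zeroing everything
for (LONG nPos = 0; nPos < fileSize; nPos++) {
    char c = s[nPos];
    // do something with 'c' here...
}
delete[] s;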
* In theory, CHAR_BIT might be larger than 8; in practice it is almost always 8 and for simplicity, I'm assuming it is 8 bits here. The discussion has to be more nuanced if CHAR_BIT is 9 or more, but the net effect is much the same.

Registry problem - deleting key/values with C++

The following piece of code seems to execute unreliably, and after a nondeterministic time it fails with error code 234 at the RegEnumValue function.
I have not written this code; I am merely trying to debug it. I know there is an issue with calling RegEnumValue and then deleting values inside the while loop.
I am trying to figure out, first, why it throws this 234 error at seemingly random points: it is never after a consistent number of loop iterations or anything like that.
From what I have seen, it fails to fill its name buffer, but this buffer is by no means too small for its purpose, so I don't understand how it could fail.
Could someone please advise on getting rid of this 234 error thrown by the RegEnumValue function?
HKEY key;
DWORD dw;
int idx;
char name[8192];
DWORD namesize = 4096;
std::string m_path = "SOFTWARE\\Company\\Server 4.0";
if (RegOpenKeyEx(HKEY_LOCAL_MACHINE, m_path.c_str(), 0, KEY_ALL_ACCESS, &key) == ERROR_SUCCESS)
{
    bool error = false;
    idx = 0;
    long result;
    long delresult;
    while (true)
    {
        result = RegEnumValue(key, idx, (char*)name, &namesize, NULL, NULL, NULL, NULL);
        if (result == ERROR_SUCCESS && !error) {
            delresult = RegDeleteValue(key, name);
            if (delresult != ERROR_SUCCESS)
                error = true;
            idx++;
        }
        else
        {
            break;
        }
    }
    RegCloseKey(key);
}
There are some errors in your code:
The fourth parameter of RegEnumValue (namesize) is an in/out parameter. So you have to reset namesize to sizeof(name)/sizeof(name[0]) (which, for the char type used here, is just sizeof(name)) inside the while loop, before every call of RegEnumValue. This is the main error in your program.
If you never want to see ERROR_MORE_DATA (which is error code 234), use a buffer of 32,767 characters; that is the maximum size of a registry value name (see the documentation of RegEnumValue).
It is not good to use KEY_ALL_ACCESS in the RegOpenKeyEx call. I recommend changing it to KEY_QUERY_VALUE | KEY_SET_VALUE. It is not a real error, but depending on your environment it could become one.
It is better to use the Unicode versions of all these functions to speed the code up a little.
UPDATED: Just a small comment about the use of the Unicode versions. Internally, Windows works with Unicode characters, so using the non-Unicode version of RegEnumValue is slower: on every call a new Unicode memory block is allocated and converted to ANSI/multi-byte. Moreover, if you have a value name written in a language that can't be represented in your Windows ANSI code page (Chinese, Japanese and so on), some characters will be replaced by '?' (see the WC_DEFAULTCHAR flag of WideCharToMultiByte), and then RegDeleteValue can fail with an error along the lines of "the value with this name does not exist".
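A minimal sketch of the corrected fragment along the lines above: Unicode APIs, the in/out size reset before every call, and always enumerating index 0, which is a common pattern when deleting while enumerating because deletion shifts the remaining indices:
#include <windows.h>

HKEY key;
if (RegOpenKeyExW(HKEY_LOCAL_MACHINE, L"SOFTWARE\\Company\\Server 4.0", 0,
                  KEY_QUERY_VALUE | KEY_SET_VALUE, &key) == ERROR_SUCCESS)
{
    wchar_t name[16384];                       // generously sized name buffer
    for (;;)
    {
        DWORD namesize = sizeof(name) / sizeof(name[0]);  // reset EVERY call
        if (RegEnumValueW(key, 0, name, &namesize,
                          NULL, NULL, NULL, NULL) != ERROR_SUCCESS)
            break;                             // ERROR_NO_MORE_ITEMS when done
        if (RegDeleteValueW(key, name) != ERROR_SUCCESS)
            break;                             // stop rather than loop forever
    }
    RegCloseKey(key);
}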
Just change the value of your fourth parameter (namesize) from 4096 to 8192. Always make sure it is equal to the buffer size.
The answer is at the bottom of this page:
http://msdn.microsoft.com/en-us/library/ms724865(VS.85).aspx
Please read the answer to the question "ERROR_MORE_DATA: lpData too small, or lpValueName too small?".