The solution is probably obvious, but I do not see it. I have this simple C++ code:
// Build the search pattern
// sPath is passed in as a parameter into this function
trim_right_if(sPath, is_any_of(L"\\"));
wstring sSearchPattern = sPath + L"\\*.*";
My problem is that the + operator has no effect (checked in debugger). The string sSearchPattern is initialized to the value of sPath only.
Notes: sPath is a wstring.
Example of what I want to achieve:
sSearchPattern -> C:\SomePath\*.*
More Info:
When I look at sPath in the debugger, I see two NULL characters after the last character. When I look at sSearchPattern, the "\*.*" is appended, but after the two NULL characters. Any explanation for that?
This should work, and indeed works for me on VS2010, SP1:
#include <iostream>
#include <string>
int main()
{
const std::wstring sPath = L"C:\\SomePath";
const std::wstring sSearchPattern = sPath + L"\\*.*";
std::wcout << sSearchPattern << L'\n';
return 0;
}
This prints
C:\SomePath\*.*
for me.
As I found out, the two null characters stored at the end of the string were the problem. Apparently std::wstring does not treat nulls specially the way a good old C string does. If it thinks a string is 10 characters long, it does not care if some of those 10 characters are null characters. If you then append to that string, the additional characters get appended after the 10th character. If the last characters of the string happen to be nulls, you get:
C:\\SomePath\0\0\\*.*
Such a string cannot really be used anywhere.
How did I get the NULL characters at the end of the original string? I used wstring.resize() in some other function which pads the string with NULLs. I did this in order to pass &string[0] to a Windows API function expecting a LPWSTR.
Now that I know this does not work, I use a true LPWSTR instead. That is a bit more clumsy, but it works. Coming from MFC, I thought I could use std::wstring like CString with its GetBuffer and ReleaseBuffer methods.
Something in your real code is adding extra null characters to the end of the string. Your bug lies in that code.
String concatenation works perfectly well. std::wstring is not null terminated and so concatenation just adds on to the end of the buffer. This makes std::wstring somewhat different from a C string because it can hold null characters. I suspect this nuance is the source of all the confusion.
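A minimal sketch of one way to keep using std::wstring as the buffer: resize it, let the API write into it, then shrink it to the characters actually written. Here fill_path is a hypothetical stand-in for the Windows API call (it is not a real function), and the buffer size of 32 is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <string>

// Hypothetical stand-in for a Windows API call that fills an LPWSTR buffer
// and null-terminates it (not a real API).
void fill_path(wchar_t* buffer, std::size_t capacity)
{
    assert(capacity > 0);
    const wchar_t source[] = L"C:\\SomePath";
    std::wcsncpy(buffer, source, capacity - 1);
    buffer[capacity - 1] = L'\0';
}

// Fetch the path into an oversized std::wstring, then shrink the string to
// the characters actually written so that no padding nulls remain.
std::wstring get_path()
{
    std::wstring path(32, L'\0');             // buffer padded with nulls
    fill_path(&path[0], path.size());
    path.resize(std::wcslen(path.c_str()));   // drop everything from the first null on
    return path;
}

// Build the search pattern from the trimmed path.
std::wstring make_search_pattern()
{
    return get_path() + L"\\*.*";
}
```

With the trailing nulls removed, make_search_pattern() yields C:\SomePath\*.* and the appended part is no longer hidden behind a null.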
I have code that works just fine when compiled with the v110 build tools. I recently upgraded the toolchain to v141 (VS 2017), and some functions that use sscanf no longer work.
The sscanf calls that stopped working use this particular format string: "%s %[^\0]". It expects an input string containing one string, followed by whitespace, followed by another string that has to be put in a buffer for later processing. The first string is copied to the first buffer correctly, but the second is not (sscanf returns 1 instead of 2).
Has anyone had this problem, or does anyone have an idea why it is happening?
A code sample to test the problem:
#include <stdio.h>
#include <string.h> // for memset
#include <tchar.h>
int main()
{
char str1[100], str2[100];
memset(str1, 0, sizeof(str1)); memset(str2, 0, sizeof(str2));
int i = sscanf("+nf foo", "%s %[^\0]", str1, str2);
return 0;
}
The sequence \0 encodes a zero byte (ASCII character NUL), which in C/C++ is a string terminator.
So your formatting string is effectively "%s %[^" with another one-char long string "]" possibly following it (possibly, because the compiler may notice the string termination and discard the unused tail).
Edit
As a string terminator, the NUL character actually can't appear in the input string (although it could appear e.g. in a file stream! I'm not sure, however, how that would be handled by fscanf()), so you need not look for it with the format specifier. If you just need to read both parts of an input string into two char arrays, just use "%s%s":
sscanf("+nf foo", "%s%s", str1, str2);
As the user n.m. pointed out in the comments, the format string I was using ("%[^\0]") doesn't work because "\0" is translated to an ASCII NUL, which ends the format string before the %[] specifier is complete.
The format string to use in this case (when you need the %[] specifier to read every byte except ASCII NUL) is "%[\1-\377]".
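A minimal sketch of a related approach, assuming the remainder of the input is a single line: "%[^\n]" reads everything up to a newline, and the field widths keep sscanf from overrunning the buffers. The helper name split_command is made up for this illustration, and this is a substitute for the range-based scanset above, not the same thing.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Splits "word rest of line" into the first word and the remainder.
// Returns the number of fields sscanf converted (2 on success).
int split_command(const char* input, std::string& first, std::string& rest)
{
    assert(input != nullptr);
    char word[100] = "";
    char tail[100] = "";
    // %99s and %99[^\n] bound the writes to the 100-byte buffers.
    const int converted = std::sscanf(input, "%99s %99[^\n]", word, tail);
    first = word;
    rest = tail;
    return converted;
}
```

Unlike "%s%s", this keeps embedded spaces in the second part, which seems to be what the original "%[^\0]" was meant to achieve.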
I'm trying to learn about string literals and the like, and I've been playing around with them. I'm currently unable to wcout a string that is the concatenation of two string literals, one of them marked with the s suffix ("string"s).
std::string concat = "Hello, "s + "World!";
It doesn't have any compiler errors if I cast to a string or make a call to a string constructor to concatenate them.
I'm also having trouble getting wcout to actually output unicode characters. I use cout elsewhere in the code.
constexpr wchar_t* surname = L"shirts \u0444 \u1300";
outputs shirts but no unicode characters when I wcout << surname; If I just cout surname I get hex.
Edit: thanks to comments I have understood the problem of wcout. I didn't realize it would only work with wstring and I was avoiding ordinary cout due to having read something about not mixing the two that I have yet to fully understand.
I still can't get the symbols to print out in wchar_t* which just outputs ordinary ascii characters.
Thanks for the swift replies thus far!
wcout works for normal chars marked with u8 but nothing else it seems. Several wcout statements just aren't outputting anything after the shirt fail, I moved them before it and they were printed out but they were hex rather than characters as expected. So far only normal char* have worked. This is such a headache...
As for no Unicode console output, you may have to set the locale, that is:
std::setlocale(LC_ALL, ""); // requires <clocale>
constexpr const wchar_t* surname = L"shirts \u0444 \u1300"; // a wide literal needs a pointer to const
std::wcout << surname;
I am making a PONG clone in C++/SDL, and I have all of my images in the directory in which the program starts. I am successfully able to find that path using GetCurrentDirectory() and open the file using strcat() to append the actual image and it will load fine, but this will change the original value, which makes it useless when I try to load the next image. How would I pass the path without changing the original value, or another way to work around this problem.
My current code:
TCHAR openingdirectorytemp [MAX_PATH];
bgtexturesurf = SDL_LoadBMP(strcat(openingdirectorytemp, "\\bg.bmp"));
Use actual C++ strings:
#include <string>
using std::string;
void child(string str) // passed by value (non-const), so it can be modified locally
{
str += ".suffix"; // parameter str is a copy of argument
}
void parent()
{
string parents_string = "abc";
child(parents_string);
// parents_string is not modified
}
If you must work with TCHAR in the Windows API world, use std::basic_string<TCHAR>:
typedef std::basic_string<TCHAR> str_t; // now use str_t everywhere
and so the code becomes something like
void my_class::load_bg_bmp(const str_t &dir_path)
{
str_t file_path = dir_path + _T("\\bg.bmp");
bgtexturesurf = SDL_LoadBMP(file_path.c_str());
// ...
}
The TCHAR type allows for build-time switching between narrow and wide characters. It is pointless to use TCHAR, but then use unwrapped narrow character string literals like "\\bg.bmp".
Also, note that strcat to an uninitialized array invokes undefined behavior. The first argument to strcat must be a string: a pointer to the first element of a null-terminated character array. An uninitialized array is not a string.
We can avoid such low-level nasties by using the C++ string class.
Although you can use C++ string as suggested by other answers, you can still keep your C approach.
What you need to do is just to create another string by copying the contents from the original, and use it for strcat:
TCHAR openingdirectorytemp [MAX_PATH];
TCHAR path [MAX_PATH];
strcpy(path, openingdirectorytemp);
bgtexturesurf = SDL_LoadBMP(strcat(path, "\\bg.bmp"));
By doing so, you create string path with a separate memory space, so strcat won't affect openingdirectorytemp
You need to make a copy of the string before concatenating if you are worried about things getting changed. In other words
string1 = "abc"
string2 = "def"
strcat(string1, string2);
Results in
string1 = "abcdef"
since that is what you asked the program to do. Instead, add
strcpy(string3, string1);
strcat(string3, string2);
Now you will have
string1 = "abc"
string3 = "abcdef"
Of course you need to make sure enough space is allocated, etc.
Since you are using C++, you can use std::string to compose your final pathname:
string pathname(path);
pathname += "\\bg.bmp";
bgtexturesurf = SDL_LoadBMP(pathname.c_str());
I am stumped by the behaviour of the following in my Win32 (ANSI) function:
(Multi-Byte Character Set NOT UNICODE)
void sOut( HWND hwnd, string sText ) // Add new text to EDIT Control
{
int len;
string sBuf, sDisplay;
len = GetWindowTextLength( GetDlgItem(hwnd, IDC_EDIT_RESULTS) );
if(len > 0)
{
// HERE:
sBuf.resize(len+1, 0); // Create a string big enough for the data
GetDlgItemText( hwnd, IDC_EDIT_RESULTS, (LPSTR)sBuf.data(), len+1 );
} // MessageBox(hwnd, (LPSTR)sBuf.c_str(), "Debug", MB_OK);
sDisplay = sBuf + sText;
sDisplay = sDisplay + "\n\0"; // terminate the string
SetDlgItemText( hwnd, IDC_EDIT_RESULTS, (LPSTR)sDisplay.c_str() );
}
This should append text to the control with each call.
Instead, all string concatenation fails after the call to GetDlgItemText(), I am assuming because of the typecast?
I have used three string variables to make it really obvious. If sBuf is affected then sDisplay should not be affected.
(Also, why is len 1 char less than the length in the buffer?)
GetDlgItemText() correctly returns the content of the EDIT control, and SetDlgItemText() will correctly set any text in sDisplay, but the concatenation in between is just not happening.
Is this a "hidden feature" of the string class?
Added:
Yes it looks like the problem is a terminating NUL in the middle. Now I understand why the len +1. The function ensures the last char is a NUL.
Using sBuf.resize(len); will chop it off and all is good.
Added:
Charles,
Leaving aside the quirky return length of this particular function, and talking about using a string as a buffer:
The standard describes the return value of basic_string::data() to be a pointer to an array whose members equal the elements of the string itself.
That's precisely what's needed isn't it?
Further, it requires that the program must not alter any of the values of that array.
As I understand it, that is going to change, along with a guarantee that all bytes are contiguous. I forget where I read a long article on this, but it asserted that MS already implements this.
What I don't like about using a vector is that the bytes are copied twice before I can return them: once into the vector and again into the string. I also need to instantiate a vector object and a string object. That is a lot of overhead. If there were some string-friendly way of working with vectors (or CStrings) without resorting to old C functions or copying characters one by one, I would use it. The string is very syntax friendly in that way.
The data() function on a std::string returns a const char*. You are not allowed to write into the buffer returned by it; it may be a duplicated buffer.
What you could do instead is to use a std::vector<char> as a temporary buffer.
E.g. (untested)
std::vector<char> sBuf( len + 1 );
GetDlgItemText( /* ... */, &sBuf[0], len + 1 );
std::string newText( &sBuf[0] );
newText += sText;
Also, the string you pass to SetDlgItemText should be \0 terminated, so you should use c_str(), not data(), for this.
SetDlgItemText( /* ... */, newText.c_str() );
Edit:
OK, I've just checked the contract for GetWindowTextLength and GetDlgItemText. Check my edits above. Both will include the space for a null terminator so you need to chop it off the end of your string otherwise concatenation of the two strings will include a null terminator in the middle of the string and the SetDlgItemText call will only use the first part of the string.
There is a further complication in that GetWindowTextLength isn't guaranteed to be accurate, it only guarantees to return a number big enough for a program to create a buffer for storing the result. It is extremely unlikely that this will actually affect a dialog box item owned by the calling code but in other situations the actual text may be shorter than the returned length. For this reason you should search for the first \0 in the returned text in any case.
I've opted to just use the std::string constructor that takes a const char* so that it finds the first \0 correctly.
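A minimal sketch of why the const char* constructor matters here. fake_get_text is a hypothetical stand-in for GetDlgItemText (not the real Windows API), and the buffer size of 16 is an arbitrary assumption chosen to be larger than the text.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for GetDlgItemText: copies a null-terminated text
// into the caller's buffer (not the real Windows API).
void fake_get_text(char* buffer, std::size_t capacity)
{
    assert(capacity > 0);
    std::strncpy(buffer, "abc", capacity - 1);
    buffer[capacity - 1] = '\0';
}

// Reads the text through a temporary vector and appends `suffix`.
std::string read_and_append(const std::string& suffix)
{
    std::vector<char> buffer(16);            // deliberately larger than the text
    fake_get_text(buffer.data(), buffer.size());
    std::string text(buffer.data());         // stops at the first '\0'
    text += suffix;
    return text;
}

// For contrast: copying the whole buffer keeps the padding nulls.
std::string read_whole_buffer()
{
    std::vector<char> buffer(16);
    fake_get_text(buffer.data(), buffer.size());
    return std::string(buffer.begin(), buffer.end());
}
```

read_and_append("def") gives a six-character string with no embedded null, while read_whole_buffer() returns all 16 characters, 13 of which are nulls that would end up in front of anything appended later.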
The standard describes the return value of basic_string::data() to be a pointer to an array whose members equal the elements of the string itself. Further, it requires that the program must not alter any of the values of that array. This means that the return value of data() may or may not be a copy of the string's internal representation and even if it isn't a copy you still aren't allowed to write to it.
I am far away from the Win32 API and its string nightmare, but there is something in the code that you can check. Standard C++ strings do not need to be null terminated, and nulls can happen anywhere within the string. I won't comment on the fact that you are casting away constness with your C-style cast, which is a problem on its own, but rather on the strange effect you are seeing.
When you initially create the string you allocate extra space for the null (and initialize all elements to '\0') and then you copy the elements. At that point your string is len+1 in size and the last element is a null. After that you append some other string, and what you get is a string that will still have a null character at position len. When you retrieve the data with either data() (does not guarantee null termination!) or c_str() the returned buffer will still have the null character at len position. If that is passed to a function that stops on null (takes a C style string), then even if the string is complete, the function will just process the first len characters and forget about the rest.
#include <algorithm> // std::copy
#include <string>
#include <cstdio>
#include <iostream>
int main()
{
const char hi[] = "Hello, ";
const char all[] = "world!";
std::string result;
result.resize( sizeof(hi), 0 );
// simulate GetDlgItemText call
std::copy( hi, hi+sizeof(hi), const_cast<char*>(result.data()) ); // this is what your C-style cast is probably doing
// append
result.append( all );
std::cout << "size: " << result.size() // 14
<< ", contents" << result // "Hello, \0world!" - dump to a file and edit with a binary editor
<< std::endl;
std::printf( "%s\n", result.c_str() ); // "Hello, "
}
As you can see, printf expects a C-style string and will stop when the first null character is found, so that it can seem as if the append operation never took place. On the other hand, C++ streams do work properly with std::string and will dump the whole content, confirming that the strings were actually appended.
A patch for your disappearing append would be to remove the '\0' from the initial string (resize the string to only len characters). But that is not really a good solution: you should never use const_cast (there are really few places where it can be required and this is not one of them). The fact that you don't see it is even worse: using C-style casts makes your code look nicer than it is.
You have commented on another answer that you do not want to add std::vector (which would provide with a correct solution as &v[0] is a proper mutable pointer into the buffer), of course, not adding the extra space for the '\0'. Consider that this is part of an implementation file, and the fact that you use or not std::vector will not extend beyond this single compilation unit. Since you are already using some STL features, you are not adding any extra requirement to your system. So to me that would be the way to go. The solution provided by Charles Bailey should work provided that you remove the extra null character.
This is NOT an answer. I have added it here as an answer only so that I can use formatting in a long going discussion about const_cast.
This is an example where using const_cast can break a running application:
#include <iostream>
#include <map>
typedef std::map<int,int> map_type;
void dump( map_type const & m ); // implemented somewhere else for concision
int main() {
map_type m;
m[1] = 10;
m[2] = 20;
m[3] = 30;
map_type::iterator it = m.find(2);
const_cast<int&>(it->first) = 10;
// At this point the order invariant of the container is broken:
dump(m); // (1,10),(10,20),(3,30) !!! unordered by key!!!!
// This happens with g++-4.0.1 in MacOSX 10.5
if ( m.find(3) == m.end() ) std::cout << "key 3 not found!!!" << std::endl;
}
That is the danger of using const_cast. You can get away in some situations, but in others it will bite back, and probably hard. Try to debug in thousands of lines where the element with key 3 was removed from the container. And good luck in your search, for it was never removed.
I spent about 4 hours yesterday trying to fix this issue in my code. I simplified the problem to the example below.
The idea is to store a string in a stringstream ending with std::ends, then retrieve it later and compare it to the original string.
#include <sstream>
#include <iostream>
#include <string>
int main( int argc, char** argv )
{
const std::string HELLO( "hello" );
std::stringstream testStream;
testStream << HELLO << std::ends;
std::string hi = testStream.str();
if( HELLO == hi )
{
std::cout << HELLO << "==" << hi << std::endl;
}
return 0;
}
As you can probably guess, the above code when executed will not print anything out.
Although, if printed out, or looked at in the debugger (VS2005), HELLO and hi look identical, their .length() in fact differs by 1. That's what I am guessing is causing the == operator to fail.
My question is why. I do not understand why std::ends is an invisible character added to string hi, making hi and HELLO different lengths even though they have identical content. Moreover, this invisible character does not get trimmed with boost trim. However, if you use strcmp to compare .c_str() of the two strings, the comparison works correctly.
The reason I used std::ends in the first place is because I've had issues in the past with stringstream retaining garbage data at the end of the stream. std::ends solved that for me.
std::ends inserts a null character into the stream. Getting the content as a std::string will retain that null character and create a string with that null character at the respective positions.
So indeed a std::string can contain embedded null characters. The following std::string contents are different:
ABC
ABC\0
A binary zero is not whitespace. But it's also not printable, so you won't see it (unless your terminal displays it specially).
Comparing using strcmp will interpret the content of a std::string as a C string when you pass .c_str(). It will say
Hmm, characters before the first \0 (terminating null character) are ABC, so i take it the string is ABC
And thus, it will not see any difference between the two above. You are probably having this issue:
std::stringstream s;
s << "hello";
s.seekp(0);
s << "b";
assert(s.str() == "b"); // will fail!
The assert will fail, because the sequence that the stringstream uses is still the old one that contains "hello". What you did is just overwriting the first character. You want to do this:
std::stringstream s;
s << "hello";
s.str(""); // reset the sequence
s << "b";
assert(s.str() == "b"); // will succeed!
Also read this answer: How to reuse an ostringstream
std::ends is simply a null character. Traditionally, strings in C and C++ are terminated with a null (ascii 0) character, however it turns out that std::string doesn't really require this thing. Anyway to step through your code point by point we see a few interesting things going on:
int main( int argc, char** argv )
{
The string literal "hello" is a traditional zero-terminated string constant. We copy its characters, without the terminating zero, into the std::string HELLO.
const std::string HELLO( "hello" );
std::stringstream testStream;
We now put the string HELLO (its five characters, with no trailing 0) into the stream, followed by a single null character inserted by std::ends.
testStream << HELLO << std::ends;
We extract a copy of what we put into the stream (the characters of "hello" plus the one null character from std::ends).
std::string hi = testStream.str();
We then compare the two strings using operator== on the std::string class. This operator compares the full contents of the string objects, including any trailing null characters. Note that the std::string class does not treat a null character as a terminator; put another way, it allows the string to contain null characters, so the trailing null character is treated as part of the string hi.
Since hi has one trailing null character and HELLO has none, the comparison fails.
if( HELLO == hi )
{
std::cout << HELLO << "==" << hi << std::endl;
}
return 0;
}
Although, if printed out, or looked at
in the debugger (VS2005), HELLO and hi
look identical, their .length() in
fact differs by 1. That's what I am
guessing is causing the "==" operator
to fail.
Reason being, the length is different by one trailing null character.
My question is why. I do not
understand why std::ends is an
invisible character added to string
hi, making hi and HELLO different
lengths even though they have
identical content. Moreover, this
invisible character does not get
trimmed with boost trim. However, if
you use strcmp to compare .c_str() of
the two strings, the comparison works
correctly.
strcmp is different from std::string - it is written from back in the early days when strings were terminated with a null - so when it gets to the first trailing null in hi it stops looking.
The reason I used std::ends in the
first place is because I've had issues
in the past with stringstream
retaining garbage data at the end of
the stream. std::ends solved that for
me.
Sometimes it is a good idea to understand the underlying representation.
You're adding a null character to the stream with std::ends. When you initialize hi with str(), that null character is kept as part of the string, so the two strings are different. strcmp doesn't compare std::strings, it compares char* (it's a C function) and stops at the first null.
std::ends adds a null terminator, (char)'\0'. You'd use it with the deprecated strstream classes, to add the null terminator.
You don't need it with stringstream, and in fact it screws things up, because the null terminator isn't "the special null terminator that ends a string" to stringstream; to stringstream it's just another character, the zeroth character. stringstream just adds it, and that increases the character count (in your case) to six, and makes the comparison to "hello" fail.
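A minimal sketch of the behaviour described above. The helper names stream_contents, equal_as_c_strings and check_ends_behaviour are made up for this illustration; the asserts spell out what each comparison is expected to give.

```cpp
#include <cassert>
#include <cstring>
#include <ostream>
#include <sstream>
#include <string>

// Returns the stream contents after writing `text`, optionally followed by std::ends.
std::string stream_contents(const std::string& text, bool append_ends)
{
    std::stringstream stream;
    stream << text;
    if (append_ends)
        stream << std::ends;   // inserts one '\0' character
    return stream.str();
}

// Compares the way strcmp does: only up to the first '\0'.
bool equal_as_c_strings(const std::string& a, const std::string& b)
{
    return std::strcmp(a.c_str(), b.c_str()) == 0;
}

// Self-check of the claims above.
void check_ends_behaviour()
{
    const std::string hello("hello");
    const std::string with_ends = stream_contents(hello, true);
    assert(with_ends.size() == hello.size() + 1);    // the extra '\0' counts
    assert(with_ends != hello);                      // std::string compares all characters
    assert(equal_as_c_strings(with_ends, hello));    // strcmp stops at the '\0'
    assert(stream_contents(hello, false) == hello);  // without std::ends they match
}
```

Dropping std::ends is enough to make the original comparison succeed, since str() already returns exactly the characters that were written.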
I think a good way to compare strings is to use the std::find method. Do not mix C functions and std::string ones!