swscanf_s causes error when reading into a wchar_t array - c++

I am writing a simple serialization using the format L"79349 Dexter 03 05"
(Assume that the Dexter part will be always 1 word.)
This string is to be read into 3 ints and a wchar_t array
I currently have the following code:
#include <iostream>
#include <stdio.h>
#include <string>
using namespace std;
int main()
{
int id=-1,season=-1,episode=-1;
wchar_t name[128];
swscanf_s(L"79349 Dexter 03 05", L"%d %ls %d %d", &id, name, &season, &episode);
wcout << "id is " << id << endl;
wcout << "name is " << wstring(name) << endl; //wprintf(L"name is %ls",name);
wcout << "season is " << season << endl;
wcout << "episode is " << episode << endl;
}
The code above compiles (in VS 2013) without a problem, but it crashes when executed. In the debugger I get the message: Unhandled exception at 0xFEFEFEFE in test3.exe: 0xC0000005: Access violation executing location 0xFEFEFEFE.
By omitting some parts, I found that the problem occurs when reading into name.
E.g. the following works just fine:
swscanf_s(L"79349 Dexter 03 05", L"%d %*ls %d %d", &id, &season, &episode);
What am I doing wrong?
My guess is that I am missing something simple and trivial, but I cannot find it on my own. Thanks in advance.

My reputation is currently too little to comment. As Brett says, you need to use wcstok_s. What you're trying to do is "tokenise" the long string into smaller token strings. This is what wcstok_s will do for you. On the other hand, swscanf_s will attempt to convert the whole string that you pass into the first format argument.
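The other reason this isn't working for you is that you haven't told swscanf_s how large the destination buffer is. The "_s" versions are more "secure" in that they protect against buffer overruns, which can corrupt memory and cause all sorts of problems. For every %s, %ls, %c or %[ conversion they expect an extra argument right after the buffer giving its size, counted in characters (not bytes) for the wide functions. Without it, the function reads whatever argument comes next (&season in your call) as the size, which would explain the access violation. If you replace your:
swscanf_s(L".IHATECPP.",L".%ls.",name);
with
swscanf_s(L".IHATECPP.", L".%ls.", name, (unsigned)_countof(name));
the result will be "IHATECPP.": the leading "." (dot) is matched by the format string and not stored, while %ls reads up to the next whitespace and therefore keeps the trailing dot.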
This question: "Split a string in C++?" might help you if you can use more C++-style routines instead of the older C-style ones. If you can't for whatever reason, then this one: "C++ Split Wide Char String" might give you some ideas instead, as it uses wcstok(). Once wcstok_s has split the original string into smaller substrings (tokens), you'll need to convert the ones you know are numbers to integers (for example with wcstol).
In general, you can search for "C++ tokenize" and you should find a lot of examples.

Related

Cout unsigned char

I'm using Visual Studio 2019: why does this command do nothing?
std::cout << unsigned char(133);
It literally gets skipped by my compiler (I verified it using step-by-step debugging).
I expected it to print à.
Everything output after it, up to the next statement, is ignored, but not what comes before it. (std::cout << "12" << unsigned char(133) << "34"; prints "12")
I've also tried to change it to these:
std::cout << unsigned char(133) << std::flush;
std::cout << (unsigned char)(133);
std::cout << char(-123);
but the result is the same.
I remember that it worked before, and some of my programs that use this command have mysteriously stopped working... I get the same result in a blank new project!
I thought that my new custom keyboard layout could be the cause, but disabling it does not change anything.
On online compilers it works properly, so could it be a bug in Visual Studio 2019?
The "sane" answer is: don't rely on extended-ASCII characters. Unicode is widespread enough to make this the preferred approach:
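#include <iostream>
int main() {
// Up to C++17, u8"" is a plain char string; in C++20 it is char8_t and can no longer be streamed to std::cout.
std::cout << u8"\u00e0\n";
}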
This will explicitly print the character à you requested; in fact, that's also how your browser understands it, which you can easily verify by pasting it into e.g. a Unicode character search, which will show LATIN SMALL LETTER A WITH GRAVE with the code U+00E0, the same value you can spot in the code above.
In your example, there's no difference between using a signed or unsigned char; the byte value 133 gets written to the terminal, but how the terminal interprets it can differ from machine to machine, depending on how it is set up. In fact, in a UTF-8 console this is simply an invalid sequence ("\x85" is a continuation byte and not a valid character on its own) - if your OS was switched to UTF-8, that might be why you're seeing no output.
You can try to use static_cast
std::cout << static_cast<unsigned char>(133) << std::endl;
Or
std::cout << static_cast<char>(133) << std::endl;
Since all of these work on my machine, it's hard to pinpoint the problem; common sense points to some configuration issue.

strlen() not working well with special characters

When trying to determine the length of a low-level character string with the strlen function from <cstring>, I have noticed that it does not work properly when the string contains Spanish characters that do not exist in English, such as the opening exclamation mark ¡, accented letters or the letter ñ. Each of these is counted as two characters, and setting the locale does not fix it.
#include <cstring>
#include <iostream>
int main() {
const char * s1 = "Hola!";
const char * s2 = "¡Hola!";
std::cout << s1 << " has " << strlen(s1) << " elements, but " << s2
<< " has " << strlen(s2) << " instead of 6" << std::endl;
}
This is a university assignment on low-level strings, so library classes such as std::string are not allowed.
strlen gives you the number of non-zero char objects in the buffer pointed to by its argument, up to the first zero char. Your system is apparently using a character encoding (most likely UTF-8) where these problematic characters take up more than one byte (that is, more than one char object).
How to solve this depends on what you're trying to do. For certain operations (such as determining the size of a buffer needed to store the string), the result from strlen is 100% correct, as it's exactly what you need. For most other purposes, welcome to the vast world of character/byte/code-point/whatever nuances. You might want to read up on text encodings, Unicode etc. http://utf8everywhere.org/ might be a good site to start.
You've mentioned this is a university assignment: based on what the teaching goal is, you might need to implement some form of UTF en/de-coding, or just steer clear of non-ASCII characters.

ifstream / ofstream issue with c++?

I have been having a very hard time writing to a binary file and reading back. I am basically writing records of this format
1234|ABCD|efgh|IJKL|ABC
Before writing this record, I would write the length of this entire record ( using string.size()) and then I write the record to the binary file using ofstream as follows:
int size;
ofstream studentfile;
studentfile.open( filename.c_str(),ios::out|ios::binary );
studentfile.write((char*)&size,sizeof(int));
studentfile.write(data.c_str(),(data.size()*(sizeof(char))));
cout << "Added " << data << " to " << filename << endl;
studentfile.close();
And I read this data at some other place
ifstream ifile11;
int x;
std::string y;
ifile11.open("student.db", ios::in |ios::binary);
ifile11.read((char*)&x,sizeof(int));
ifile11.read((char*)&y,x);
cout << "X " << x << " Y " << y << endl;
First I read the length of the record into the variable x, and then I read the record into the string y. The problem is that the output shows x as '0' and y as empty.
I am not able to figure this out. I would be very grateful to anyone who can look into this problem and provide some insight.
Thank you
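You can't read a string that way, because a std::string object typically holds only a pointer to its character data plus some size bookkeeping, so reading raw bytes over the object itself corrupts it. (Try doing std::string s; sizeof(s): the size will be constant no matter what you set the string to.)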
Instead read it into a temporary buffer, and then convert that buffer into a string:
int length;
ifile11.read(reinterpret_cast<char*>(&length), sizeof(length));
char* temp_buffer = new char[length];
ifile11.read(temp_buffer, length);
std::string str(temp_buffer, length);
delete [] temp_buffer;
I know I am answering my own question, but I strongly feel this information is going to help others. For the most part, Joachim's answer is correct and works. However, there were two main issues behind my problem:
1. The Dev-C++ compiler was having a hard time reading binary files.
2. Not passing strings properly while writing to the binary file, and also reading from the file. For the reading part, Joachim's answer fixed it all.
The Dev-C++ IDE didn't help me. It wrongly read data from the binary file, and it did it without me even making use of a temp_buffer. Visual C++ 2010 Express has correctly identified this error, and threw run-time exceptions and kept me from being misled.
As soon as I took all my code into a new VC++ project, it appropriately provided me with error messages, so that I could fix it all.
So, please do not use Dev-C++ unless you want to run into real trouble like this. Also, when trying to read strings, Joachim's answer is the ideal way.

Parse int to string with stringstream

Well!
I feel really stupid for this question, and I wholly don't mind if I get downvoted for this, but I guess I wouldn't be posting this if I had not at least made an earnest attempt at looking for the solution.
I'm currently working on Euler Problem 4, finding the largest palindromic number that is the product of two three-digit numbers [100..999].
As you might guess, I'm at the part where I have to work with the integer I made. I looked up a few sites and saw a few standards for converting an Int to a String, one of which included stringstream.
So my code looked like this:
// tempTotal is my int value I want converted.
void toString( int tempTotal, string &str )
{
ostringstream ss; // C++ Standard compliant method.
ss << tempTotal;
str = ss.str(); // Overwrite referenced value of given string.
}
and the function calling it was:
else
{
toString( tempTotal, store );
cout << loop1 << " x " << loop2 << "= " << store << endl;
}
So far, so good. I can't really see an error in what I've written, but the output gives me the address to something. It stays constant, so I don't really know what the program is doing there.
Secondly, I tried .ToString(), string.valueOf( tempTotal ), (string)tempTotal, or simply store = temptotal.
All refused to work. When I simply tried doing an implicit cast with store = tempTotal, it didn't give me a value at all. When I tried checking output it literally printed nothing. I don't know if anything was copied into my string that simply isn't a printable character, or if the compiler just ignored it. I really don't know.
So even though I feel this is a really, really lame question, I just have to ask:
How do I convert that stupid integer to a string with the stringstream? The other tries are more or less irrelevant for me, I just really want to know why my stringstream solution isn't working.
EDIT:
Wow. Seriously. This is kind of embarrassing. I forgot to set my tempTotal variable to something. It was uninitialized, so there was nothing meaningful to copy, which is why the program gave me either a 0 or nothing at all.
Hope people can have a laugh though. I think this question would now be better suited for deletion, since it doesn't really serve a purpose xD But thanks to everybody who tried to help me!
Have you tried just outputting the integer as is? If you're only converting it to a string to output it, then don't bother since cout will do that for you.
else
{
// toString( tempTotal, store ); // Skip this step.
cout << loop1 << " x " << loop2 << "= " << tempTotal << endl;
}
I have a feeling that it's likely that tempTotal doesn't have the value you think it has.
I know this doesn't directly answer your question but you don't need to write your own conversion function, you can use boost
#include <boost/lexical_cast.hpp>
using boost::lexical_cast;
//usage example
std::string s = lexical_cast<std::string>(tempTotal);
Try the following:
string toString(int tempTotal)
{
ostringstream ss;
ss << tempTotal;
return ss.str();
}
string store = toString(tempTotal);
If you want to output the integer, you don't even need to convert it; just insert it into the standard output:
int i = 100;
cout << i;
If you want the string representation, you're doing it right. Insert it into a stringstream as you did, and ask for its str().
If that doesn't work, I suggest you minimize the amount of code, and try to pinpoint the actual problem using a debugger :)
Short answer: your method to convert an int to a string works. Got any other questions?

C++ - string.compare issues when output to text file is different to console output?

I'm trying to find out if two strings I have are the same, for the purpose of unit testing. The first is a predefined string, hard-coded into the program. The second is read in from a text file with an ifstream using std::getline(), and then taken as a substring. Both values are stored as C++ strings.
When I output both of the strings to the console using cout for testing, they both appear to be identical:
ThisIsATestStringOutputtedToAFile
ThisIsATestStringOutputtedToAFile
However, the string.compare returns stating they are not equal. When outputting to a text file, the two strings appear as follows:
ThisIsATestStringOutputtedToAFile
T^#h^#i^#s^#I^#s^#A^#T^#e^#s^#t^#S^#t^#r^#i^#n^#g^#O^#u^#t^#p^#u^#t^#
t^#e^#d^#T^#o^#A^#F^#i^#l^#e
I'm guessing this is some kind of encoding problem, and if I was in my native language (good old C#), I wouldn't have too many problems. As it is I'm with C/C++ and Vi, and frankly don't really know where to go from here! I've tried looking at maybe converting to/from ansi/unicode, and also removing the odd characters, but I'm not even sure if they really exist or not..
Thanks in advance for any suggestions.
EDIT
Apologies, this is my first time posting here. The code below is how I'm going through the process:
ifstream myInput;
ofstream myOutput;
myInput.open(fileLocation.c_str());
myOutput.open("test.txt");
TEST_ASSERT(myInput.is_open() == 1);
string compare1 = "ThisIsATestStringOutputtedToAFile";
string fileBuffer;
std::getline(myInput, fileBuffer);
string compare2 = fileBuffer.substr(400,100);
cout << compare1 + "\n";
cout << compare2 + "\n";
myOutput << compare1 + "\n";
myOutput << compare2 + "\n";
cin.get();
myInput.close();
myOutput.close();
TEST_ASSERT(compare1.compare(compare2) == 0);
How did you create the content of myInput? I would guess that this file is created in two-byte encoding. You can use hex-dump to verify this theory, or use a different editor to create this file.
The simplest way would be to launch cmd.exe and type
echo "ThisIsATestStringOutputtedToAFile" > test.txt
UPDATE:
If you cannot change the encoding of the myInput file, you can try to use wide-chars in your program. I.e. use wstring instead of string, wifstream instead of ifstream, wofstream, wcout, etc.
The following works for me and writes the text pasted below into the file. Note that the '\0' (written as \x0) in the literal ends the string when it is converted to std::string, so nothing after it reaches the stream.
#include <iostream>
#include <fstream>
#include <sstream>
int main()
{
std::istringstream myInput("0123456789ThisIsATestStringOutputtedToAFile\x0 12ou 9 21 3r8f8 reohb jfbhv jshdbv coerbgf vibdfjchbv jdfhbv jdfhbvg jhbdfejh vbfjdsb vjdfvb jfvfdhjs jfhbsd jkefhsv gjhvbdfsjh jdsfhb vjhdfbs vjhdsfg kbhjsadlj bckslASB VBAK VKLFB VLHBFDSL VHBDFSLHVGFDJSHBVG LFS1BDV LH1BJDFLV HBDSH VBLDFSHB VGLDFKHB KAPBLKFBSV LFHBV YBlkjb dflkvb sfvbsljbv sldb fvlfs1hbd vljkh1ykcvb skdfbv nkldsbf vsgdb lkjhbsgd lkdcfb vlkbsdc xlkvbxkclbklxcbv");
std::ofstream myOutput("test.txt");
//std::ostringstream myOutput;
std::string str1 = "ThisIsATestStringOutputtedToAFile";
std::string fileBuffer;
std::getline(myInput, fileBuffer);
std::string str2 = fileBuffer.substr(10,100);
std::cout << str1 + "\n";
std::cout << str2 + "\n";
myOutput << str1 + "\n";
myOutput << str2 + "\n";
std::cout << str1.compare(str2) << '\n';
//std::cout << myOutput.str() << '\n';
return 0;
}
Output:
ThisIsATestStringOutputtedToAFile
ThisIsATestStringOutputtedToAFile
It turned out that the problem was that the file encoding of myInput was UTF-16, whereas the comparison string was UTF-8. Given the OS limitations I had for this project (Linux, C/C++ code), the way to convert them was to use iconv. To stay compatible with the C++ strings I'd been using, I ended up saving the string to a new text file, then running iconv through the system() command.
system("iconv -f UTF-16 -t UTF-8 subStr.txt -o convertedSubStr.txt");
Reading the outputted string back in then gave me the string in the format I needed for the comparison to work properly.
NOTE
I'm aware that this is not the most efficient way to do this. If I'd had the luxury of a Windows environment and the windows.h libraries, things would have been a lot easier. In this case though, the code was in some rarely used unit tests and didn't need to be highly optimized, so the creation, destruction and I/O operations of a few text files weren't an issue.