std::wifstream ifstream("JobList.txt");
ifstream.imbue(std::locale(std::locale::empty(), new std::codecvt_utf8<wchar_t>));
if (!ifstream.is_open()) {
std::cout << "File not found!" << std::endl;
return 0;
}
std::wstring s;
wchar_t name[20];
int priority{};
int workingTime{};
int requestTime{};
while (ifstream) {
std::getline(ifstream, s);
swscanf(s.data(), L"%[^',']s, %d, %d, %d", name, &priority, &workingTime, &requestTime);
mRequestArrivationQueue.emplace(name, priority, workingTime, requestTime);
}
ifstream.close();
This is JobList.txt file
Good Boy, 1, 2, 5
도서 대출, 1, 2, 13
swscanf reads only the first field (the name wstring); it does not read the remaining integer values.
There is a small error in your code and a terrible (if common) bad practice.
The error is that the conversion specifier is %[set], and it must not be followed by an s. Here the format string requires a literal s character after the first field (which is impossible, because the next character in the input is the comma), so the conversion stops after decoding the first field. The fix is trivial: remove the offending s (and the useless quotes, thanks to @AdrianMole for his comment):
swscanf(s.data(), L"%[^,], %d, %d, %d", name, &priority, &workingTime, &requestTime);
And the terrible practice is to fail to test the return value of a scanf family function. Had you tested it, you would have immediately found that it was 1 and that only the first field had been decoded.
IMHO, unless you are a C programmer and have used the C I/O functions for a long time, you would be better off using a C++ [w]stringstream. The syntax is not easier, but error detection is better...
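A minimal sketch of the stream-based alternative, assuming each line has the form name, priority, workingTime, requestTime. The Job struct and the parseJobLine function are names invented for this illustration; they are not part of the question's code.

```cpp
#include <sstream>
#include <string>

// Hypothetical record type, used only for this illustration.
struct Job {
    std::wstring name;
    int priority{};
    int workingTime{};
    int requestTime{};
};

// Parses one line such as L"Good Boy, 1, 2, 5" into `job`.
// Returns false if a field is missing or malformed.
bool parseJobLine(const std::wstring& line, Job& job) {
    std::wistringstream in(line);
    wchar_t comma1 = 0, comma2 = 0;

    // Everything up to the first comma is the name (it may contain spaces).
    if (!std::getline(in, job.name, L',') || job.name.empty()) return false;

    // operator>> skips leading whitespace before each number and each comma.
    if (!(in >> job.priority >> comma1 >> job.workingTime >> comma2 >> job.requestTime))
        return false;

    // The two separator characters must really be commas.
    return comma1 == L',' && comma2 == L',';
}
```

Because the function reports failure through its return value, a caller can skip or log a bad line instead of silently enqueueing stale values.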
Cppcheck 1.67 raised a portability issue in my source code at this line:
sscanf(s, "%d%*[,;.]%d", &f, &a);
This is the message I got from it:
scanf without field width limits can crash with huge input data on some versions of libc.
The original intention of the format string was to accept one of three possible limiter chars between two integers, and today - thanks to Cppcheck[1] - I see that %*[,;.] accepts even strings of limiter chars. However I doubt that my format string may cause a crash, because the unlimited part is ignored.
Is there possibly an issue with a buffer overrun? ...maybe behind the scenes?
[1]
How to get lost between farsightedness and blindness:
I tried to fix it with %1*[,;.] (after reading some API docs), but Cppcheck kept reporting the issue, so I also tried %*1[,;.] with the same "success". It seems that I have to suppress it for now...
Congratulations on finding a bug in Cppcheck 1.67 (the current version).
You have basically three workarounds:
Just ignore the false positive.
Rework your format (assign that field, possible as you only want to match one character).
char tmp;
if(3 != sscanf(s, "%d %c%d", &f, &tmp, &a) || tmp!=',' && tmp!=';' && tmp!= '.')
goto error;
Suppress the warning directly (preferably inline-suppressions):
//cppcheck-suppress invalidscanf_libc
if(2 != sscanf(s, "%d%*1[,;.]%d", &f, &a))
goto error;
Don't forget to report the error, as "defect / false positive", so you can retire and forget that workaround as fast as possible.
When to quantify ignored pattern match in the C sscanf function?
Probably it's a good idea to always quantify (see below), but over-quantification may also distract from your intentions. In the above case, where a single separator char has to be skipped, the quantification would definitely be useful.
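To illustrate what quantifying the ignored match changes, here is a small sketch. The function name parsePair is made up for this example. It relies on the standard order inside a conversion specification: the assignment-suppressing * comes first, then the field width.

```cpp
#include <cstdio>

// Parses "<int><sep><int>", where <sep> is exactly one of ',', ';' or '.'.
// "%*1[,;.]" matches at most one separator character and does not store it.
// Suppressed conversions are not counted in the return value, so success is 2.
bool parsePair(const char* s, int& first, int& second) {
    return std::sscanf(s, "%d%*1[,;.]%d", &first, &second) == 2;
}
```

With the width of 1, an input such as "12,,34" is rejected, because the second comma is left in the input and the final %d cannot match it. Without the width, the whole run of separators would be skipped.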
Is there possibly an issue with a buffer overrun? ...maybe behind the scenes?
Your code will not cause any crashes. To address the "behind the scenes" question, I experimented with large input strings. Using the C library that ships with Borland C++ 5.6.4, I could not trigger an internal buffer overrun, even with large inputs (more than 400 million chars).
Surprisingly, Cppcheck was not totally wrong - there is a portability issue, but a different one:
#include <stdio.h>
#include <assert.h>
#include <sstream>
#include <string>
int traced_sscanf_set(const int count, const bool limited)
{
const char sep = '.';
printf("\n");
std::stringstream ss;
ss << "123" << std::string(count, sep) << "456";
std::string s = ss.str();
printf("string of size %d with %d '%c's in it\n", (int)s.size(), count, sep);
std::stringstream fs;
fs << "%d%";
if (limited) {
fs << count;
}
fs << "*["<< sep << "]%d";
std::string fmt = fs.str();
printf("fmt: \"%s\"\n", fmt.c_str());
int a = 0;
int b = 0;
const int sscanfResult = sscanf(s.c_str(), fmt.c_str(), &a, &b);
printf("sscanfResult=%d, a=%d, b=%d\n", sscanfResult, a, b);
return sscanfResult;
}
void test_sscanf()
{
assert(traced_sscanf_set(0x7fff, true)==2);
assert(traced_sscanf_set(0x7fff, false)==2);
assert(traced_sscanf_set(0x8000, true)==2);
assert(traced_sscanf_set(0x8000, false)==1);
}
The library I checked internally limits the input consumed (and skipped) to 32767 (2^15 - 1) chars if no limit is explicitly specified in the format string.
For those who are interested, here is the trace output:
string of size 32773 with 32767 '.'s in it
fmt: "%d%32767*[.]%d"
sscanfResult=2, a=123, b=456
string of size 32773 with 32767 '.'s in it
fmt: "%d%*[.]%d"
sscanfResult=2, a=123, b=456
string of size 32774 with 32768 '.'s in it
fmt: "%d%32768*[.]%d"
sscanfResult=2, a=123, b=456
string of size 32774 with 32768 '.'s in it
fmt: "%d%*[.]%d"
sscanfResult=1, a=123, b=0
int setN, setN2;
char sign;
scanf_s("do %d %c %d", &setN, &sign, &setN2);
When I enter "do 1 + 3", for example, the program crashes in Visual Studio with the error "Unhandled exception at 0x650de541 in disc_II_2_1.exe: 0xC0000005: Access violation writing location 0xc96ff41e".
P.S. The code below gives the same result.
scanf_s("do %d %c %d", &setN, &sign, &setN2, 8);
What am I doing wrong?
From MSDN:
Unlike scanf and wscanf, scanf_s and wscanf_s require the buffer size
to be specified for all input parameters of type c, C, s, S, or string
control sets that are enclosed in []. The buffer size in characters is
passed as an additional parameter immediately following the pointer to
the buffer or variable.
and later
In the case of characters, a single character may be read as follows:
char c;
scanf_s("%c", &c, 1);
At the end of that reference, there are also a few examples where you may see that:
the count argument should appear immediately after the corresponding input
the count argument should correspond to the maximum number of expected char (or as stated above for a single char, it should be 1)
So, in your particular case you should have:
scanf_s("do %d %c %d", &setN, &sign, 1, &setN2);
I'm writing C++ code for school in which I can only use the std library, so no boost. I need to parse a string like "14:30" and parse it into:
unsigned char hour;
unsigned char min;
We get the string as a c++ string, so no direct pointer. I tried all variations on this code:
sscanf(hour.c_str(), "%hhd[:]%hhd", &hours, &mins);
but I keep getting wrong data. What am I doing wrong?
As everyone else has mentioned, you have to use the %d format specifier (or %u). As for the alternative approaches, I am not a big fan of "because C++ has feature XX it must be used" and often resort to C-level functions. That said, I never use scanf()-like functions, as they have their own problems. Here is how I would parse your string using strtol() with error checking:
#include <cstdio>
#include <cstdlib>
int main()
{
unsigned char hour;
unsigned char min;
const char data[] = "12:30";
char *ep;
hour = (unsigned char)strtol(data, &ep, 10);
if (!ep || *ep != ':') {
fprintf(stderr, "cannot parse hour: '%s' - wrong format\n", data);
return EXIT_FAILURE;
}
min = (unsigned char)strtol(ep+1, &ep, 10);
if (!ep || *ep != '\0') {
fprintf(stderr, "cannot parse minutes: '%s' - wrong format\n", data);
return EXIT_FAILURE;
}
printf("Hours: %u, Minutes: %u\n", hour, min);
}
Hope it helps.
Your problem is, of course, that you are using sscanf. And that
you're using some very special type for the hours and minutes, instead
of int. Since you're parsing a string of exactly 5 characters, the
simplest solution is just to ensure that all of the characters are legal
in that position, using isdigit for characters 0, 1, 3 and 4, and
comparing to ':' for character 2. Once you've done that, it's trivial
to create an std::istringstream from the string, and input into an
int, a char (which you'll ignore afterwards) and a second int. If
you want to be more flexible in the input, for example allowing things
like "9:45" as well, you can skip the initial checks, and just input
into int, char and int, then check that the char contains ':'
(and that the two int are in range).
As to why your sscanf is failing: you're asking it to match something
like "12[:]34", which is not what you're giving it. I'm not sure
whether you're trying to use "%hhd:%hhd", or if for some reason you
really do want a character class, in which case, you have to use [ as
a conversion specifier, and then ignore the input: "%hhd%*[:]%hhd".
(This would allow accepting more than one character as the separator,
but otherwise, I don't see the advantage. Also, technically at least,
using %d and then passing the address of an unsigned integral type
is not supported, %hhd must be a signed char. In practice,
however, I don't think you'll ever run into any problems for
non-negative input values less than 128.)
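The flexible variant described above (read an int, a char and a second int, then check the separator and the ranges) could be sketched like this. The name parseTime and the decision to reject trailing characters are choices made for this example, not something the answer prescribes.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: parses "H:MM" or "HH:MM" into hour and minute.
// The outputs are written only when the whole string is valid.
bool parseTime(const std::string& text, unsigned char& hour, unsigned char& minute) {
    std::istringstream in(text);
    int h = 0, m = 0;
    char sep = 0;

    if (!(in >> h >> sep >> m)) return false;   // a field is missing or not numeric
    if (sep != ':') return false;               // wrong separator
    if (h < 0 || h > 23 || m < 0 || m > 59) return false;

    // Reject trailing non-whitespace characters such as "14:30abc".
    char extra;
    if (in >> extra) return false;

    hour = static_cast<unsigned char>(h);
    minute = static_cast<unsigned char>(m);
    return true;
}
```

Reading into int first and narrowing afterwards avoids the problem that operator>> on an unsigned char extracts a single character rather than a number. Note that operator>> skips whitespace, so this sketch also tolerates spaces around the fields.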
As mentioned by izomorphius sscanf and variants are not C++ they are C. The C++ way would be to use streams. The following works (it's not amazingly flexible but should give you an idea)
#include <iostream>
#include <string>
#include <sstream>
using namespace std;
int main(int argc, char* argv[])
{
string str = "14:30";
stringstream sstrm;
int hour,min;
sstrm << str;
sstrm >> hour;
sstrm.get(); // get colon
sstrm >> min;
cout << hour << endl;
cout << min << endl;
return 0;
}
You could also use getline to get everything up to the colon.
I would do it like this
unsigned tmp_hours, tmp_mins;
unsigned char hour, mins;
// str is the std::string holding the input, e.g. "14:30"
sscanf(str.c_str(), "%u:%u", &tmp_hours, &tmp_mins);
hour = tmp_hours;
mins = tmp_mins;
Less messing around with obscure scanf options. I would add some error checking too.
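One way the error checking could look, as a sketch only. The function name scanTime and the use of %n to detect trailing characters are additions for this example.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical helper: parses "HH:MM" with sscanf and validates the result.
bool scanTime(const std::string& text, unsigned char& hour, unsigned char& mins) {
    unsigned tmpHours = 0, tmpMins = 0;
    int consumed = 0;

    // %n stores the number of characters read so far; it is not counted
    // in the return value, so a full match still returns 2.
    if (std::sscanf(text.c_str(), "%u:%u%n", &tmpHours, &tmpMins, &consumed) != 2)
        return false;

    // Anything left over after the minutes means the input was malformed.
    if (static_cast<std::size_t>(consumed) != text.size()) return false;

    if (tmpHours > 23 || tmpMins > 59) return false;

    hour = static_cast<unsigned char>(tmpHours);
    mins = static_cast<unsigned char>(tmpMins);
    return true;
}
```

The range check runs on the unsigned temporaries, before the narrowing conversion, so an out-of-range value cannot wrap around into a plausible-looking hour or minute.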
My understanding is that h in %hhd is not a valid format specifier. The correct specifier for decimal integers is %d.
As R.Martinho Fernandes says in his comment, %d:%d will match two numbers separated by a colon (':').
Did you want something different?
You can always read the entire text string and parse it any way you want.
sscanf with %hhd:%hhd seems to work perfectly fine:
std::string time("14:30");
unsigned char hour, min;
sscanf(time.c_str(), "%hhd:%hhd", &hour, &min);
Note that the hh length modifier is simply to allow storing the value in an unsigned char.
However, sscanf is from the C Standard Library and there are better C++ ways to do this. A C++11 way to do this is using stoi:
std::string time("14:30");
unsigned char hour = std::stoi(time);
unsigned char min = std::stoi(time.substr(3));
In C++03, we can use stringstream instead but it's a bit of a pain if you really want it in a char:
std::stringstream stream("14:30");
unsigned int hour, min;
stream >> hour;
stream.ignore();
stream >> min;
I don't see this an option in things like sprintf().
How would I convert the letter F to 255? Basically the reverse operation of conversion using the %x format in sprintf?
I am assuming this is something simple I'm missing.
char const* data = "F";
int num = int(strtol(data, 0, 16));
Look up strtol and boost::lexical_cast for more details and options.
Use the %x format in sscanf!
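A minimal sketch of that suggestion; parseHex is a name chosen for this example. Note that %x expects a pointer to unsigned int, and that the return value should be checked.

```cpp
#include <cstdio>

// Hypothetical helper: converts a hexadecimal string such as "FF" to a number.
// Returns false if the text does not start with a hexadecimal digit sequence.
bool parseHex(const char* text, unsigned& value) {
    return std::sscanf(text, "%x", &value) == 1;
}
```

With this helper, "FF" yields 255, while a single "F" yields 15.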
The C++ way of doing it, with streams:
#include <iomanip>
#include <iostream>
#include <sstream>
int main() {
std::string hexvalue = "FF";
int value;
// Construct an input stringstream, initialized with hexvalue
std::istringstream iss(hexvalue);
// Set the stream in hex mode, then read the value, with error handling
if (iss >> std::hex >> value) std::cout << value << std::endl;
else std::cout << "Conversion failed" << std::endl;
}
The program prints 255.
You can't get (s)printf to convert 'F' to 255 without some black magic. Printf will convert a character to other representations, but won't change its value. This might show how character conversion works:
printf("Char %c is decimal %i (0x%X)\n", 'F', 'F', 'F');
printf("The high order bits are ignored: %d: %X -> %hhX -> %c\n",
0xFFFFFF46, 0xFFFFFF46, 0xFFFFFF46, 0xFFFFFF46);
produces
Char F is decimal 70 (0x46)
The high order bits are ignored: -186: FFFFFF46 -> 46 -> F
Yeah, I know you asked about sprintf, but that won't show you anything until you do another print.
The idea is that each generic integer parameter to printf is put on the stack (or in a register) by promotion. That means it is expanded to its largest generic size: bytes, characters, and shorts are converted to int by sign extension or zero padding. This keeps the parameter list on the stack in a sensible state. It's a nice convention, but it probably had its origin in the 16-bit word orientation of the stack on the PDP-11 (where it all started).
In the printf library (on the receiving end of the call), the code uses the format specifier to determine what part of the parameter (or all of it) is processed. So if the format is '%c', only 8 bits are used. Note that there may be some variation between systems on how the hex constants are 'promoted'. But if a value greater than 255 is passed to a character conversion, the high-order bits are ignored.
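The effect can be observed without printing to the terminal by formatting into a buffer. This is a small sketch assuming an ASCII execution character set and 8-bit bytes; lowByteHex and asChar are names made up for the illustration.

```cpp
#include <cstdio>
#include <string>

// Formats only the low-order byte of `value` as hexadecimal.
// The hh length modifier makes printf convert the argument to unsigned char.
std::string lowByteHex(unsigned value) {
    char buffer[8];
    std::snprintf(buffer, sizeof buffer, "%hhX", value);
    return buffer;
}

// Formats `value` with %c, which converts the int argument to unsigned char.
std::string asChar(int value) {
    char buffer[8];
    std::snprintf(buffer, sizeof buffer, "%c", value);
    return buffer;
}
```

For example, lowByteHex(0xFFFFFF46u) gives "46", and asChar(0x146) gives "F", because only the low byte 0x46 survives the conversion.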
Imagine the following:
you read in a string with scanf() but you only need a few of the datapoints in the string.
Is there an easy way to throw away the extraneous information, without losing the ability to check if the appropriate data is there so you can still reject malformed strings easily?
example:
const char* store = "Get: 15 beer 30 coke\n";
const char* dealer= "Get: 8 heroine 5 coke\n";
const char* scream= "Get: f* beer 10 coke\n";
I want to accept the first string, but forget about the beer because beer is yuckie.
I want to reject the second and third strings because they are clearly not the appropriate lists for the 7/11;
So I was thinking about the following construction:
char* bId = new char[16];
char* cId = new char[16];
int cokes;
sscanf([string here], "Get: %d %s %d %s\n", [don't care], bId, &cokes, cId);
This way I would keep the format checking, but what would I put for [don't care] that doesn't make the compiler whine?
Of course I could just make a variable I don't use later on, but that is not the point of this question. Also, checking the left and right side separately is an obvious solution I am not looking for here.
So, is there a way to not care about but still check the type of a piece of string in scanf and friends?
Use a * as assignment suppression character after %
Example:
sscanf([string here], "Get: %*d %s %d %s\n", bId, &cokes, cId);
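A fuller sketch of how this might be used with format checking. The function name parseCokes, the %15s width limits, and the comparison against "beer" and "coke" are assumptions added for this example. Suppressed conversions still have to match, but they are not counted in the return value, so a complete match returns 3 here rather than 4.

```cpp
#include <cstdio>
#include <cstring>

// Hypothetical helper: accepts lines of the form "Get: <int> beer <int> coke"
// and stores only the number of cokes.
bool parseCokes(const char* line, int& cokes) {
    char bId[16];
    char cId[16];
    int n = 0;

    // %*d must match an integer but stores nothing.
    // %15s leaves room for the terminating null in a 16-byte buffer.
    if (std::sscanf(line, "Get: %*d %15s %d %15s", bId, &n, cId) != 3) return false;

    // Reject shopping lists that name the wrong items.
    if (std::strcmp(bId, "beer") != 0 || std::strcmp(cId, "coke") != 0) return false;

    cokes = n;
    return true;
}
```

Using fixed-size arrays with a matching field width avoids both the manual new[]/delete[] pair and the risk of overrunning the buffer on a long word.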