Read into std::string using scanf - c++

As the title says, I'm curious whether there is a way to read a C++ string with scanf.
I know that I can read each char and append it to the desired string, but I'd like something like:
string a;
scanf("%SOMETHING", &a);
gets() also doesn't work.
Thanks in advance!

This can work:
char tmp[101];
scanf("%100s", tmp);
string a = tmp;

There is no situation under which gets() is to be used! It is always wrong to use gets() and it is removed from C11 and being removed from C++14.
scanf() doesn't support any C++ classes. However, you can store the result from scanf() into a std::string:
Editor's note: The following code is wrong, as explained in the comments. See the answers by Patato, tom, and Daniel Trugman for correct approaches.
std::string str(100, ' ');
if (1 == scanf("%*s", &str[0], str.size())) {
// ...
}
I'm not entirely sure about the way to specify that buffer length in scanf() and in which order the parameters go (there is a chance that the parameters &str[0] and str.size() need to be reversed and I may be missing a . in the format string). Note that the resulting std::string will contain a terminating null character and it won't have changed its size.
Of course, I would just use if (std::cin >> str) { ... } but that's a different question.

Problem explained:
You CAN populate the underlying buffer of an std::string using scanf, but(!) the managed std::string object will NOT be aware of the change.
const char *line="Daniel 1337"; // The line we're gonna parse
std::string token;
token.reserve(64); // You should always make sure the buffer is big enough
sscanf(line, "%s %*u", token.data());
std::cout << "Managed string: " << token
<< " (size = " << token.size() << ")" << std::endl;
std::cout << "Underlying buffer: " << token.data()
<< " (size = " << strlen(token.data()) << ")" << std::endl;
Outputs:
Managed string: (size = 0)
Underlying buffer: Daniel (size = 6)
So, what happened here?
The object std::string is not aware of changes not performed through the exported, official, API.
When we write to the object through the underlying buffer, the data changes, but the string object is not aware of that.
If we replace the original call token.reserve(64) with token.resize(64), a call that changes the size of the managed string, the results are different:
const char *line="Daniel 1337"; // The line we're gonna parse
std::string token;
token.resize(64); // You should always make sure the buffer is big enough
sscanf(line, "%s %*u", token.data());
std::cout << "Managed string: " << token
<< " (size = " << token.size() << ")" << std::endl;
std::cout << "Underlying buffer: " << token.data()
<< " (size = " << strlen(token.data()) << ")" << std::endl;
Outputs:
Managed string: Daniel (size = 64)
Underlying buffer: Daniel (size = 6)
Once again, the result is sub-optimal. The output is correct, but the size isn't.
Solution:
If you really want to do this, follow these steps:
Call resize to make sure your buffer is big enough. Use a #define for the maximal length (see step 2 to understand why):
std::string buffer;
buffer.resize(MAX_TOKEN_LENGTH);
Use scanf while limiting the size of the scanned string using "width modifiers" and check the return value (return value is the number of tokens scanned):
#define XSTR(x) STR(x) // expands x first, then stringifies the result
#define STR(x) #x
...
int rv = scanf("%" XSTR(MAX_TOKEN_LENGTH) "s", &buffer[0]);
Reset the managed string size to the actual size in a safe manner:
buffer.resize(strnlen(buffer.data(), MAX_TOKEN_LENGTH));
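The three steps above can be combined into one helper. This is only a minimal sketch, not the answer author's code: the name scan_token and the fixed width of 63 are hypothetical, the width is written as a literal instead of going through the XSTR macro, and sscanf on an in-memory line stands in for scanf so that the function can be tried without stdin.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: reads one whitespace-delimited token of at most
// 63 characters from `line` into a std::string.
bool scan_token(const char* line, std::string& out) {
    assert(line != nullptr);
    out.resize(63);                               // step 1: size() is now 63
    int rv = std::sscanf(line, "%63s", &out[0]);  // step 2: the width matches size()
    if (rv != 1) {                                // nothing was converted
        out.clear();
        return false;
    }
    out.resize(std::strlen(out.c_str()));         // step 3: shrink to the real length
    return true;
}
```

After a successful call, out.size() equals the number of characters scanned, so the managed string and its underlying buffer agree.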

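The snippet below works:
string s(100, '\0');
scanf("%100s", &s[0]); // &s[0] is writable, while c_str() returns a const pointer; the width prevents overflow
s.resize(strlen(s.c_str())); // shrink to the characters actually read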

Here is a version with no length limit (for when the length of the input is unknown):
std::string read_string() {
std::string s; unsigned int uc; int c;
// The ASCII code of space is 32, and all codes less than or equal to 32 are non-printing.
// EOF is negative, so it becomes larger than 32 after the unsigned conversion.
while ((uc = (unsigned int)getchar()) <= 32u);
if (uc < 256u) s.push_back((char)uc);
while ((c = getchar()) > 32) s.push_back((char)c);
return s;
}
As for performance, getchar is typically faster than scanf, and std::string::reserve can pre-allocate the buffer to avoid frequent reallocation.

You can construct an std::string of an appropriate size and read into its underlying character storage:
std::string str(100, ' ');
scanf("%100s", &str[0]);
str.resize(strlen(str.c_str()));
The call to str.resize() is critical, otherwise the length of the std::string object will not be updated. Thanks to Daniel Trugman for pointing this out.
(There is no off-by-one error with the size reserved for the string versus the width passed to scanf, because since C++11 it is guaranteed that the character data of std::string is followed by a null terminator so there is room for size+1 characters.)

int n = 15; // you are going to scan no more than n characters
std::string str(n, '\0'); // room for n characters; the terminator after them is guaranteed since C++11
scanf("%15s", &str[0]); // the width must equal n; scanf writes into the string as if it were an array
str = str.c_str(); // trim to the scanned length; without this line size() stays at n

Related

Heap corruption after second use of strcat

This is driving me nuts because I'm not seeing what bonehead mistake I'm making here.
In the following snippet (note that this is just a test snippet from a larger method), I'm basically just attempting to copy a string that's retrieved from a SQL method and then, if the user specifies an additional number of columns in the method, append a delimiter (in this case a semicolon) and the additional string:
//...
char** pLocalArray;
char buff[512];
//... pLocalArray is allocated
// The semicolon is replaced by a variable passed into the function, but just putting this for simplicity
char delimeterStr[2] { ';', '\0' };
for (int uCol = 0; uCol < numCols; uCol++)
{
if (uCol >= 1)
{
const char* test2 = "1704EB18-FE46-4AE4-A90F-06E42C3EE07A"; // Just a test GUID
memcpy(buff, test2, 37); // Just testing some logic, copy the string into the buffer
strcat(pLocalArray[uRow], delimeterStr); // This works just fine if I stop here
// strcat(pLocalArray[uRow], buff); // ***** If I uncomment out this line, it throws a heap exception
std::cout << "Check 3 -- Output is: " << pLocalArray[uRow] << endl; // Output: MyFirstString|MySecondString|MyThirdString;1704EB18-FE46-4AE4-A90F-06E42C3EE07A
std::memset(buff, '\0', sizeof(buff));
std::cout << "Check 4 -- Output is: " << pLocalArray[uRow] << endl; //Sanity check - MyFirstString|MySecondString|MyThirdString;1704EB18-FE46-4AE4-A90F-06E42C3EE07A
}
else
{
const char* test = "MyFirstString|MySecondString|MyThirdString";
memcpy(buff, test, 43);
pLocalArray[uRow] = _strdup(buff);
std::cout << "Check -- Output is: " << pLocalArray[uRow] << endl; // Output: MyFirstString|MySecondString|MyThirdString
std::memset(buff, '\0', sizeof(buff));
std::cout << "Check 2 -- Output is: " << pLocalArray[uRow] << endl; //Sanity check - Output: MyFirstString|MySecondString|MyThirdString
}
}
//...
However, as you can see from the comments, it's throwing an exception when I use the second strcat call. I don't understand why the strcat on the delimiter works just fine, but appending the delimiter and then immediately appending the GUID string does not. Can someone point out what I'm doing incorrectly or not taking into account?
You may be misunderstanding how the strdup function works. In the following line:
pLocalArray[uRow] = _strdup(buff);
which is called to initially allocate memory for pLocalArray[uRow], the amount of space allocated will be the actual length of the buff string, interpreted as a nul-terminated character array; this will be the length of the "MyFirstString|MySecondString|MyThirdString" literal, rather than the specified size of the buff array.
Then, when you later try to append a string to that, you are overflowing the allocated space (your first strcat only seems to work, but it is nevertheless undefined behaviour).
To allow space for up to 511 characters (plus the nul-terminator), you will need code like the following:
pLocalArray[uRow] = static_cast<char*>(malloc(sizeof(buff))); // Allocate full size of "buff" (C++ needs the cast)
strcpy(pLocalArray[uRow], buff); // then copy the string data
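An alternative to reserving the full 512 bytes is to allocate exactly what the joined result needs. The sketch below assumes the three pieces are known before allocating; the names join_alloc and join are hypothetical and do not come from the question's code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>
#include <string>

// Hypothetical helper: returns a malloc'd copy of first + delim + second.
// The caller releases it with free().
char* join_alloc(const char* first, const char* delim, const char* second) {
    assert(first && delim && second);
    std::size_t total = std::strlen(first) + std::strlen(delim)
                      + std::strlen(second) + 1;   // +1 for the nul-terminator
    char* out = static_cast<char*>(std::malloc(total));
    if (out == nullptr) return nullptr;
    std::strcpy(out, first);
    std::strcat(out, delim);   // safe: the space was counted above
    std::strcat(out, second);
    return out;
}

// The same operation with std::string, which manages the storage itself.
std::string join(const std::string& first, const std::string& delim,
                 const std::string& second) {
    return first + delim + second;
}
```

The std::string version avoids the size bookkeeping altogether, which is usually the simpler choice when the surrounding code does not require a raw char*.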

Conversion from string constant, pointers in c++

After reading several answers I have corrected my code to as follows;
int main()
{
// a pointer to char is initialized with a string literal
char Buffer[100];
cout << "Enter an initial string: " << endl;
cin >> Buffer;
cout << "Original content of Buffer:--> " << Buffer << endl;
cout << "Enter a sentence: " << endl;
cin >> Buffer;
gets(Buffer);
cout << "Now the Buffer contains:--> " << Buffer << endl;
return 0;
}
I no longer get the warning, but now the program doesn't execute as I would like. The last part does not output my new sentence.
I know people mentioned not to use gets, but I tried using getline, and obviously I can't use it as a direct replacement, so I was a bit lost.
Any suggestions?
You cannot read into memory that holds a string constant. Such constants are often stored in read-only memory, and even when they are not, identical constants may be shared, so you would overwrite that string for every part of your code that uses it.
You need to copy the string into some buffer and then do whatever you want. For example:
const char *myOrigBuffer = "Dummy string";
char buffer[1024];
strcpy(buffer, myOrigBuffer);
....
gets(buffer);
You cannot modify a string literal. Your way of coding is too much "C style".
If the original buffer content doesn't matter and you must use gets(), don't initialize your buffer:
char Buffer[100];
cout << "Enter a sentence: " << endl;
gets(Buffer);
cout << "Now the Buffer contains:--> " << endl;
cout << Buffer << endl;
Don't forget that if you enter 100 or more characters (the buffer holds 99 plus the terminator), gets() writes past the end of the buffer, which is undefined behaviour and will likely crash.
Since gets is deprecated, fgets is the better choice: it protects you from overflows. In C style, you would write:
char buffer[10];
printf("Your sentence : \n");
fgets(buffer, sizeof buffer, stdin);
printf("Your sentence : %s\n", buffer);
Outputs:
Your sentence :
1 34 6789012345
Your sentence : 1 34 6789
Nonetheless, you should consider using std::cin with std::string to read input the C++ way:
std::string sentence;
std::cout << "Enter a sentence (with spaces): ";
std::getline(std::cin, sentence);
std::cout << "Your sentence (with spaces): " << sentence << std::endl;
Outputs:
Enter a sentence (with spaces): My super sentence
Your sentence (with spaces): My super sentence
A string literal like "Dummy content." is logically const, since any operation that attempts to change its contents results in undefined behaviour.
The definition/initialisation
char *Buffer = "Dummy content.";
however, makes Buffer a non-const pointer to (the first character of) a string literal. That involves a conversion (from array of const char to a char *). That conversion exists in C for historical reasons so is still in C++. However, subsequently using Buffer to modify the string literal - which is what gets(Buffer) does unless the user enters no data - still gives undefined behaviour.
Your "stopped working" error is one manifestation of undefined behaviour.
Giving undefined behaviour is the reason the conversion is deprecated.
Note: gets() is more than deprecated. It has been removed completely from the C standard, where it originated, because it is so dangerous (there is no way to prevent it from overwriting arbitrary memory). In C++, use getline() instead. It is often not a good idea to mix C I/O functions and C++ stream functions on the same input or output device (or file) anyway.
char *Buffer = "Dummy content.";
You should use a pointer to const char here because "Dummy content." is not a buffer but a string literal, which has type "array of n const char" and static storage duration, so it cannot be changed through a pointer. The correct variant is:
char const* Literal = "Dummy content.";
But you cannot use it as a parameter for gets:
gets(Buffer);
It is a bad idea and should cause a write access exception or memory corruption on writing. You should pass gets a pointer to a block of memory where the received string will be stored.
This block must be long enough to store the whole string, so in general gets is unsafe; see https://stackoverflow.com/a/4309845/2139056 for more info.
But as a temporary test solution you can use a buffer on the stack:
char Buffer[256];
gets(Buffer);
or a dynamically allocated buffer:
char* Buffer= new char[256];
gets(Buffer);
//and do not forget to free memory after your output operations
delete [] Buffer;

weird output when printing data of custom string (c++ newbie)

my main concern is if i am doing this safely, efficiently, and for the most part doing it right.
i need a bit of help writing my implementation of a string class. perhaps someone could help me with what i would like to know?
i am attempting to write my own string class for extended functionality and for learning purposes. i will not use this as a substitute for std::string because that could be potentially dangerous. :-P
when i use std::cout to print out the contents of my string, i get some unexpected output, and i think i know why, but i am not really sure. i narrowed it down to my assign function because any other way i store characters in the string works quite fine. here is my assign function:
void String::assign(const String &s)
{
unsigned bytes = s.length() + 1;
// if there is enough unused space for this assignment
if (res_ >= bytes)
{
strncpy(data_, s.c_str(), s.length()); // use that space
res_ -= bytes;
}
else
{
// allocate enough space for this assignment
data_ = new char[bytes];
strcpy(data_, s.c_str()); // copy over
}
len_ = s.length(); // optimize the length
}
i have a constructor that reserves a fixed amount of bytes for the char ptr to allocate and hold. it is declared like so:
explicit String(unsigned /*rbytes*/);
the res_ variable simply records the passed in amount of bytes and stores it. this is the constructor's code within string.cpp:
String::String(unsigned rbytes)
{
data_ = new char[rbytes];
len_ = 0;
res_ = rbytes;
}
i thought using this method would be a bit more efficient than allocating new space for the string, so i can just use whatever space i reserved initially when i declared a new string. here is how i am testing to see if it works:
#include <iostream>
#include "./string.hpp"
int main(int argc, char **argv)
{
winks::String s2(winks::String::to_string("hello"));
winks::String s(10);
std::cout << s2.c_str() << "\n" << std::endl;
std::cout << s.unused() << std::endl;
std::cout << s.c_str() << std::endl;
std::cout << s.length() << std::endl;
s.assign(winks::String::to_string("hello")); // Assign s to "hello".
std::cout << s.unused() << std::endl;
std::cout << s.c_str() << std::endl;
std::cout << s.length() << std::endl;
std::cout.flush();
std::cin.ignore();
return 0;
}
if you are concerned about winks::String::to_string, i am simply converting a char ptr to my string object like so:
String String::to_string(const char *c_s)
{
String temp = c_s;
return temp;
}
however, the constructor i use in this method is private, so i am forcing to_string upon myself. i have had no problems with this so far. the reason why i made this is to avoid rewriting methods for different parameters ie: char * and String
the code for the private constructor:
String::String(const char *c_s)
{
unsigned t_len = strlen(c_s);
data_ = new char[t_len + 1];
len_ = t_len;
res_ = 0;
strcpy(data_, c_s);
}
all help is greatly appreciated. if i have not supplied a sufficient amount of information, please tell me what you want to know and i will gladly edit my post.
edit: the reason why i am not posting the full string.hpp and string.cpp is because it is rather large and i am not sure if you guys would like that.
You have to make a decision whether you will always store your strings internally terminated with a 0. If you don't store your strings with a terminating zero byte, your c_str function has to add one. Otherwise, it's not returning a C-string.
Your assign function doesn't 0 terminate. So either it's broken, or you didn't intend to 0 terminate. If the former, fix it. If the latter, check your c_str function to make sure it puts a 0 on the end.
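One way to follow this advice is to keep data_ 0-terminated at all times. The class below is a hypothetical, trimmed-down stand-in for the String in the question, not its real code: assign takes a const char* rather than a String, and it assumes that res_ means the capacity of data_ in bytes, terminator included.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the question's String class.
// Assumption: res_ is the capacity of data_ in bytes, terminator included.
class String {
public:
    explicit String(unsigned rbytes)
        : data_(new char[rbytes > 0 ? rbytes : 1]),
          len_(0),
          res_(rbytes > 0 ? rbytes : 1) {
        data_[0] = '\0';                  // an empty string is still terminated
    }
    ~String() { delete[] data_; }
    String(const String&) = delete;       // copying omitted to keep the sketch short
    String& operator=(const String&) = delete;

    void assign(const char* s) {
        assert(s != nullptr);
        unsigned bytes = static_cast<unsigned>(std::strlen(s)) + 1;
        if (res_ < bytes) {               // not enough room: replace the buffer
            char* fresh = new char[bytes];
            delete[] data_;               // release the old block instead of leaking it
            data_ = fresh;
            res_ = bytes;
        }
        std::memmove(data_, s, bytes);    // copies the terminating 0 as well
        len_ = bytes - 1;
    }

    const char* c_str() const { return data_; }
    unsigned length() const { return len_; }

private:
    char* data_;
    unsigned len_;
    unsigned res_;
};
```

Because every path through assign copies the terminator, c_str() can return data_ directly. The sketch also frees the old buffer before replacing it, which the assign function in the question does not do.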

snprintf of unsigned long appending a comma

I am trying to convert an unsigned long into a character string, appending a comma at the end of it. When compiling and running the test code below, I get the following output:
"1234," "1234"
"1234"
The test code is:
#include <cstdio>
#include <iostream>
int main () {
unsigned long c = 1234;
char ch[50];
char ch1[50];
sprintf(ch, "%lu,", c);
std::cout << "\"" << ch << "\"" << " \"" << c << "\"" << std::endl;
snprintf(ch1, 5, "%s", ch);
std::cout << "\"" << ch1 << "\"" << std::endl;
return 1;
}
As far as I understand, ch should be of length 5, 4 digits plus 1 for the comma.
Do I miss an extra plus one for the termination character?
Cheers!
The size that is passed to snprintf includes the terminating null character. Although it is not printed, it still takes space in the buffer.
You should pass strlen(ch) + 1 instead. Or even better, just sizeof(ch1) will suffice since you want to accommodate the entire result before filling the buffer.
Also make sure that the destination buffer is always of a sufficient size, equal to or greater than the size you pass to snprintf. In your particular case it can hardly happen, but in general you should keep that in mind.
From the Linux manual page:
The functions snprintf() and vsnprintf() write at most size bytes (including the trailing null byte ('\0')) to str.
So yes, you should have a length of 6 to get the comma as well.
The C++ way:
#include <string>
unsigned long int c = 1234;
std::string s = "\"" + std::to_string(c) + ",\"";
std::string t = '"' + std::to_string(c) + ',' + '"'; // alternative
As you wrote, you miss an extra space for the termination character.
The functions snprintf() and vsnprintf() write at most size bytes
(including the terminating null byte ('\0')) to str.
As several people have pointed out, you need to include enough space for the null terminator.
It's always worth checking that the result returned from snprintf is what you think it should be, as well.
Lastly, I'd recommend using snprintf(buffer, sizeof(buffer), etc, etc).
You're less likely to get embarrassing results if the 2nd parameter happens to be larger than the actual space you've got available.
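The advice above (pass the real buffer size and check the return value) can be sketched as follows. The helper name format_with_comma is hypothetical. snprintf returns the number of characters the full output needs, not counting the terminator, so a return value greater than or equal to the size means the output was truncated.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical helper: formats `value` followed by a comma into `buf`.
// Returns true only if the whole result, terminator included, fit in `size` bytes.
bool format_with_comma(char* buf, std::size_t size, unsigned long value) {
    assert(buf != nullptr && size > 0);
    int n = std::snprintf(buf, size, "%lu,", value);
    // n is the length the full output needs, not counting the terminator
    return n >= 0 && static_cast<std::size_t>(n) < size;
}
```

With a 5-byte buffer and the value 1234, the function reports truncation and the buffer holds "1234", which matches what the question observed. A 6-byte buffer is the smallest that holds "1234," plus the terminator.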

how to print char array in c++

how can i print a char array that i initialize and then concatenate to another char array? Please see the code below
int main () {
char dest[1020];
char source[7]="baby";
cout <<"source: " <<source <<endl;
cout <<"return value: "<<strcat(dest, source) <<endl;
cout << "pointer pass: "<<dest <<endl;
return 0;
}
this is the output
source: baby
return value: v����baby
pointer pass: v����baby
basically i would like to see the output print
source: baby
return value: baby
pointer pass: baby
You haven't initialized dest
char dest[1020] = ""; //should fix it
You were just lucky that the 6th (random) value in dest happened to be 0. If the first 0 had been the 1000th character, your return value would be much longer. If there were no 0 within the array's 1020 bytes, strcat would run past the end of the array, which is undefined behavior.
Strings as char arrays must be delimited with 0. Otherwise there's no telling where they end. You could alternatively say that the string ends at its zeroth character by explicitly setting it to 0;
char dest[1020];
dest[0] = 0;
Or you could initialize your whole array with 0's
char dest[1020] = {};
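Once dest starts out as a valid empty string, appending can also be bounds-checked. The following is a minimal sketch under that assumption; the helper name append is hypothetical and is not part of the standard library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: appends `src` to the 0-terminated string in `dest`,
// refusing when the result would not fit in `dest_size` bytes.
bool append(char* dest, std::size_t dest_size, const char* src) {
    assert(dest && src);
    std::size_t used = std::strlen(dest);      // requires dest to be 0-terminated
    std::size_t extra = std::strlen(src);
    if (used + extra + 1 > dest_size) return false;
    std::memcpy(dest + used, src, extra + 1);  // copies the terminator too
    return true;
}
```

Called as append(dest, sizeof(dest), source) on a dest initialized with "", this leaves "baby" in dest. The strlen call is exactly where an uninitialized dest would go wrong, which is why the initialization matters.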
And since your question is tagged C++ I cannot but note that in C++ we use std::strings which save you from a lot of headache. Operator + can be used to concatenate two std::strings
Don't use char[]. If you write:
std::string dest;
std::string source( "baby" );
// ...
dest += source;
, you'll have no problems. (In fact, your problem is due to the fact
that strcat requires a '\0' terminated string as its first argument,
and you're giving it random data. Which is undefined behavior.)
Your dest array isn't initialized, so strcat tries to append source to the end of dest, which is determined by a trailing '\0' character, but it's undefined where an uninitialized array might end (if it does at all).
So you end up printing more or less random characters until a '\0' character accidentally occurs.
Try this
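#include <cstdio>
#include <cstring>
#include <iostream>
using namespace std;
int main()
{
char dest[1020];
memset(dest, 0, sizeof(dest)); // zero the buffer so it starts as an empty string
char source[7] = "baby";
cout << "Source: " << source << endl;
// strcat_s (MSVC) returns an error code (0 on success), not a pointer to dest
cout << "return value: " << strcat_s(dest, source) << endl;
cout << "pointer pass: " << dest << endl;
getchar();
return 0;
}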
Done using VS 2010 Express.
Clear the memory with memset as soon as you declare dest, so that it always holds a valid empty string. Also, if you are using VC++, use strcat_s() instead of strcat().