A risk of string copy - c++

I'm just studying C/C++ strings using MS Visual Studio 2013, and I have a question about the function strcpy_s.
I find that even if one doesn't give the dest char* enough memory, strcpy_s still appears to succeed. Is there any risk in using it like this?
code:
const char* s5 = "hello!";
char* cs6 = new char[1];
strcpy_s(cs6, strlen(s5) + 1, s5);
cs6[2] = 'g';
string s7(cs6);
cout << "----------cs6------------" << endl;
cout << s7 << endl;
display in console:
----------cs6------------
heglo!

And I have a question about the function strcpy_s. I find that even if one doesn't give the dest char* enough memory, strcpy_s still appears to succeed. Is there any risk ...?
Not if you use strcpy_s correctly, like this:
int buffer_size = 1; // this is silly since null terminated buffer
// of size 1 can only fit a string of length 0
char* cs6 = new char[buffer_size];
strcpy_s(cs6, buffer_size, s5);
cs6[buffer_size - 1] = '\0';
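The code above cannot overflow the buffer: when the source does not fit, strcpy_s reports the failure (MSVC invokes its invalid parameter handler) instead of writing past the end. However: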
Is there any risk using like this?
const char* s5 = "hello!";
char* cs6 = new char[1];
strcpy_s(cs6, strlen(s5) + 1, s5);
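Yes, there is risk. The second argument of strcpy_s must be the size of the destination buffer, not the length of the source; here it claims 7 bytes when only 1 was allocated, so the copy writes past the end of the allocation. The behaviour is undefined. Best case scenario: The program crashes. Worse scenario: A blackhat hacker exploits the behaviour and steals your data that you were bound by law to not leak.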
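A minimal portable sketch of the rule behind this answer: the size handed to a bounded copy has to describe the destination, not the source. The helper name copy_bounded is hypothetical; it is not strcpy_s and not part of any library, it only imitates the "refuse instead of overflow" idea using standard C++.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies src into dest only if it fits, including the
// terminating '\0'. dest_size must be the real capacity of dest.
// Returns true on success; on failure dest is set to an empty string.
bool copy_bounded(char* dest, std::size_t dest_size, const char* src)
{
    if (dest == nullptr || dest_size == 0 || src == nullptr)
        return false;
    const std::size_t needed = std::strlen(src) + 1; // characters + '\0'
    if (needed > dest_size) {
        dest[0] = '\0'; // leave a valid empty string, never write past the end
        return false;
    }
    std::memcpy(dest, src, needed);
    return true;
}
```

With a buffer allocated as new char[std::strlen(s5) + 1] and that same value passed as dest_size, the copy succeeds. With new char[1] and an honest size of 1 it returns false rather than corrupting memory.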

Related

Copy a part of an std::string in a char* pointer

Let's suppose I have this code snippet in C++
char* str;
std::string data = "This is a string.";
I need to copy the string data (except the first and the last characters) in str.
My solution that seems to work is creating a substring and then performing the std::copy operation like this
std::string substring = data.substr(1, size - 2);
str = new char[size - 1];
std::copy(substring.begin(), substring.end(), str);
str[size - 2] = '\0';
But maybe this is a bit of overkill because I create a new string. Is there a simpler way to achieve this goal? Maybe by working with offsets in the std::copy call?
Thanks
As mentioned above, you should consider keeping the sub-string as a std::string and use c_str() method when you need to access the underlying chars.
However-
If you must create the new string as a dynamic char array via new you can use the code below.
It checks whether data is long enough, and if so allocates memory for str and uses std::copy similarly to your code, but with adapted iterators.
Note: there is no need to allocate a temporary std::string for the sub-string.
The Code:
#include <algorithm>
#include <string>
#include <iostream>
int main()
{
    std::string data = "This is a string.";
    auto len = data.length();
    char* str = nullptr;
    if (len > 2)
    {
        auto new_len = len - 2;
        str = new char[new_len+1]; // add 1 for zero termination
        std::copy(data.begin() + 1, data.end() - 1, str); // copy from 2nd char till one before the last
        str[new_len] = '\0'; // add zero termination
        std::cout << str << std::endl;
        // ... use str
        delete[] str; // must be released eventually
    }
}
Output:
his is a string
There is:
int length = data.length() - 2; // drop the first and the last character
str = new char[length + 1]; // +1 for the terminating \0
memcpy(str, data.c_str() + 1, length);
str[length] = 0;
This will copy the string in data, starting at position [1] (instead of [0]) and keep copying until length() - 2 bytes have been copied. (-2 because you want to omit both the first and the last character).
The terminating \0 is then written right after the copied bytes, finalizing the string.
Of course this approach will cause problems if the string does not have at least 2 characters, so you should check for that beforehand.
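The advice to keep the result as a std::string can be sketched as follows. The function name strip_ends is hypothetical, and returning an empty string for inputs shorter than two characters is an assumption; the question does not say what should happen in that case.

```cpp
#include <string>

// Returns data without its first and last character.
// Strings shorter than 2 characters yield an empty result (an assumption).
std::string strip_ends(const std::string& data)
{
    if (data.size() < 2)
        return std::string();
    return data.substr(1, data.size() - 2);
}
```

Store the result in a named std::string and call c_str() on that object when a const char* is needed; the pointer stays valid only as long as the string it came from is alive and unmodified.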

Convert from vector<unsigned char> to char* includes garbage data

I'm trying to base64 decode a string, then convert that value to a char array for later use. The decode works fine, but then I get garbage data when converting.
Here's the code I have so far:
std::string encodedData = "VGVzdFN0cmluZw=="; //"TestString"
std::vector<BYTE> decodedData = base64_decode(encodedData);
char* decodedChar;
decodedChar = new char[decodedData.size() +1]; // +1 for the final 0
decodedChar[decodedData.size() + 1] = 0; // terminate the string
for (size_t i = 0; i < decodedData.size(); ++i) {
decodedChar[i] = decodedData[i];
}
BYTE is a typedef of unsigned char, as taken from this SO answer. The base64 code is also from this answer (the most upvoted answer, not the accepted answer).
When I run this code, I get the following value in the VisualStudio Text Visualiser:
TestStringÍ
I've also tried other conversion methods, such as:
char* decodedChar = reinterpret_cast< char *>(&decodedData[0]);
Which gives the following:
TestStringÍÍÍýýýýÝÝÝÝÝÝÝ*b4d“
Why am I getting the garbage data at the end of the string? What am I doing wrong?
EDIT: clarified which answer in the linked question I'm using
char* decodedChar;
decodedChar = new char[decodedData.size() +1]; // +1 for the final 0
Why would you manually allocate a buffer and then copy to it when you have std::string available that does this for you?
Just do:
std::string encodedData = "VGVzdFN0cmluZw=="; //"TestString"
std::vector<BYTE> decodedData = base64_decode(encodedData);
std::string decodedString { decodedData.begin(), decodedData.end() };
std::cout << decodedString << '\n';
If you need a char * out of this, just use .c_str()
const char* cstr = decodedString.c_str();
If you need to pass this on to a function that takes char* as input, for example:
void someFunc(char* data);
//...
//call site
someFunc( &decodedString[0] );
We have a TON of functions, abstractions and containers in C++ that were made to improve upon the C language, so that programmers wouldn't have to write things by hand and make the same mistakes every time they code. It is best to use those facilities wherever we can, instead of raw loops, for simple conversions like this.
You are writing beyond the last element of your allocated array, which can cause literally anything to happen (according to the C++ standard). You need decodedChar[decodedData.size()] = 0;

Ustring error (during printing)

I want to parse a UTF-8 file into a ustring. I read the file into str.
There is an error:
terminate called after throwing an instance of 'Glib::ConvertError'.
What should I do?
char* cs = (char*) malloc(sizeof(char) * str.length());
strcpy(cs, str.c_str());
ustring res;
while (strlen(cs) > 0) {
gunichar ch = g_utf8_get_char(cs);
res.push_back(ch);
cs = g_utf8_next_char(cs);
}
wofstream wout("output");
cout << res << endl;
This looks very wrong:
char* cs = (char*) malloc(sizeof(str.c_str()));
as sizeof(str.c_str()) is bound to give you some small number like 4 or 8 (whichever is the size of a pointer on your machine), since it measures the pointer returned by str.c_str(), not the string it points to.
Of course, it doesn't REALLY matter, since the next line, you are leaking the memory you just allocated incorrectly:
cs = const_cast<char*> (str.c_str());
I'm far from convinced that you need the const_cast<char *> (it is certainly WRONG to do this, since modifying the characters of a std::string through the pointer returned by c_str() is undefined behaviour).
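For read-only traversal, neither malloc nor const_cast is needed: the bytes can be walked through a const char*. The sketch below deliberately does not use Glib. It is plain C++ with a hypothetical function name, and it assumes str already holds valid UTF-8 with no embedded '\0' bytes.

```cpp
#include <cstddef>
#include <string>

// Counts code points by walking the bytes through a const pointer.
// Assumes str holds valid UTF-8: continuation bytes have the form 10xxxxxx,
// so every other byte starts a new code point.
std::size_t count_code_points(const std::string& str)
{
    std::size_t count = 0;
    for (const char* p = str.c_str(); *p != '\0'; ++p) {
        const unsigned char byte = static_cast<unsigned char>(*p);
        if ((byte & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

The same loop shape works with the Glib helpers from the question, as long as the pointer is declared const char* and nothing is written through it.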

Simple C++ char array encryption function - Segment fault

As always, problems with the pointers. I am trying to create a very simple "encryption/decryption" function for char arrays. Yes, I know I can use strings, but I want to improve my knowledge about pointers and make use of simple bytes to achieve a simple task.
So, I created a simple struct like this:
struct text {
char* value;
int size;
};
And I created this simple function:
text encrypt(text decrypted) {
char key = 'X';
for (int i=0; i<decrypted.size; i++) {
decrypted.value[i] = decrypted.value[i] ^ ((key + i) % 255);
}
return decrypted;
}
At this point, an experienced C++ programmer should have spotted the problem, I think. Anyway, I call this function like this:
...
text mytext;
mytext.value = new char[5];
mytext.value = "Hello";
mytext.size = 5;
mytext = encrypt(mytext);
...
I get, like always, a 'Segmentation fault(core dumped)' error. This is Linux, and, of course, g++. What have I done, again? Thanks!
mytext.value = new char[5];
mytext.value = "Hello";
on the second line, you throw away the (handle to the) allocated memory, leaking it, and let mytext.value point to a string literal. Modifying a string literal is undefined behaviour, and usually crashes, since string literals are often stored in a read-only memory segment.
If you insist on using a char*, you should strncpy the string into the allocated memory (but be aware that it won't be 0-terminated then; it would be better to allocate a new char[6] and copy the 0-terminator as well).
Or let encrypt create a new text that it returns:
text encrypt(text decrypted) {
char key = 'X';
text encrypted;
encrypted.size = decrypted.size;
encrypted.value = new char[encrypted.size];
for (int i=0; i<decrypted.size; i++) {
encrypted.value[i] = decrypted.value[i] ^ (key + i) % 255;
}
// What about 0-terminators?
return encrypted;
}
But, as you're using C++, std::string would be a better choice here.
You're modifying string literals:
mytext.value = "Hello";
after this, you can no longer legally mutate what mytext.value points to, you can only re-assign the pointer.
The fix: use std::string
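A minimal sketch of what the std::string version could look like, keeping the question's position-dependent XOR key. The name xor_cipher and the default key parameter are assumptions for illustration, not code from either answer.

```cpp
#include <cstddef>
#include <string>

// XORs every byte with a position-dependent key, as in the question.
// Takes the text by value, so the caller's string is left untouched.
// Applying the function twice restores the original text.
std::string xor_cipher(std::string text, char key = 'X')
{
    for (std::size_t i = 0; i < text.size(); ++i) {
        const unsigned char k =
            static_cast<unsigned char>((static_cast<unsigned char>(key) + i) % 255);
        text[i] = static_cast<char>(static_cast<unsigned char>(text[i]) ^ k);
    }
    return text;
}
```

Because std::string tracks its own length, the result stays usable even if a byte happens to become '\0', which a bare char* without a size field could not represent.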

Why is my char* writable and sometimes read only in C++

I have had really big problems understanding char* lately.
Let's say I made a recursive function to reverse a char*, but depending on how I initialize it I get some access violations. In my C++ primer I didn't find anything pointing me in the right direction, so I am seeking your help.
CASE 1
First case where I got access violation when trying to swap letters around:
char * bob = "hello";
CASE 2 Then I tried this to get it work
char * bob = new char[5];
bob[0] = 'h';
bob[1] = 'e';
bob[2] = 'l';
bob[3] = 'l';
bob[4] = 'o';
CASE 3 But then when I did a cout I got some random crap at the end so I changed it for
char * bob = new char[6];
bob[0] = 'h';
bob[1] = 'e';
bob[2] = 'l';
bob[3] = 'l';
bob[4] = 'o';
bob[5] = '\0';
CASE 4 That worked so I told myself why wouldn't this work then
char * bob = new char[6];
bob = "hello\0";
CASE 5 and it failed, I have also read somewhere that you could do something like
char* bob[];
Then add something to that.
My question is why do some fail and other not, and what is the best way to do it?
The key is that some of these pointers are pointing at allocated memory (which is read/write) and some of them are pointing at string constants. String constants are stored in a different location than the allocated memory, and can't be changed. Well most of the time. Often vulnerabilities in systems are the result of code or constants being changed, but that is another story.
In any case, the key is the use of the new keyword, this is allocating space in read/write memory and thus you can change that memory.
This statement is wrong
char * bob = new char[6];
bob = "hello\0";
because you are changing the pointer not copying the data. What you want is this:
char * bob = new char[6];
strcpy(bob,"hello");
or
strncpy(bob,"hello",6);
You don't need the nul here because a string constant "hello" will have the null placed by the compiler.
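The difference between pointing at a literal and owning a writable copy can be sketched like this. Both function names are hypothetical, and std::reverse is used in place of the question's recursive function, which was not shown.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Returns a heap-allocated, writable copy of text (a literal is fine as
// input because it is only read). The caller must delete[] the result.
char* writable_copy(const char* text)
{
    const std::size_t n = std::strlen(text);
    char* copy = new char[n + 1];      // +1 for the terminating '\0'
    std::memcpy(copy, text, n + 1);    // copies the terminator too
    return copy;
}

// Reverses a writable, '\0'-terminated string in place.
void reverse_in_place(char* s)
{
    std::reverse(s, s + std::strlen(s));
}
```

Calling reverse_in_place on the result of writable_copy("hello") is fine; calling it directly on the literal "hello" would attempt to write to read-only data.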
char * bob = "hello";
This is actually translated to:
const char __hello[] = "hello";
char * bob = (char*) __hello;
You can't change it, because if you'd written:
char * bob = "hello";
char * sam = "hello";
It could be translated to:
const char __hello[] = "hello";
char * bob = (char*) __hello;
char * sam = (char*) __hello;
now, when you write:
char * bob = new char[6];
bob = "hello\0";
First you assign one value to bob, then you assign a new value to it. What you really want to do here is:
char * bob = new char[6];
strcpy(bob, "hello");
You should always use char const* for pointers to string literals (stuff in double quotes). Older C++ standards allowed char* as well (C++11 removed that conversion), but no version allows writing to the string literal. GCC gives a compile warning for assigning a literal address into char*, but apparently some other compilers don't.
Edit: The question was originally tagged C and has since been retagged as C++.
Ok. You have got a couple of things mixed up...
new is used by C++, not C.
Case #1. That declares a pointer to char that points at a string literal. The literal itself is read-only, so swapping its characters in place is what triggers the access violation. Can you show the code you used to swap the characters?
Case #2/#3. You got random crap because the string was missing its nul terminator, i.e. '\0', which ends every single string you'll encounter in C/C++, possibly for the rest of your life...
+-+-+-+-+-+--+
|H|e|l|l|o|\0|
+-+-+-+-+-+--+
           ^
           |
     Nul Terminator
Case #4 did not work because you need strcpy to do that job; you cannot simply assign a string like that after calling new. When you declare a string with char *s = "foo"; the pointer is initialized to point at the literal. But when you do it this way, char *s = new char[6]; strcpy(s, "hello"); the characters get copied into the memory block that s points to.
You will eventually discover that the memory block s points to can easily be overwritten, which will induce a fit of conniptions as you realize that you have to be careful to prevent buffer overflows... Remember Case #3 in relation to the nul terminator... don't forget that the buffer for that string needs 6 chars, not 5, as we're taking the nul terminator into account.
Case #5. That is declaring an array of pointers to char (it also needs a size or an initializer to compile), so each element can point at a different string; think of it like this
*(bob + 0) = "foo";
*(bob + 1) = "bar";
I know there is a lot to digest...but feel free to post any further thoughts... :) And best of luck in learning...