How to avoid using memcpy to create a string - C++

My program has a lot of memcpy calls, and I want to avoid them.
I want to get a struct like boost::string_ref.
Specifically, I want to know the following.
uint32_t len = 20;
char *p = new char[len];
memset(p, 0x00, len);
memcpy(p, "aaaa", 4);
string str(p, 4); // does this use memcpy or not?
const string str2(p, 4); // does this use memcpy or not?
// if it does use memcpy, how can I avoid it?
string_ref->string
string_view->string
string->string_ref
string->string_view
char* ->string
string->string
Please tell me how to determine whether a function uses memcpy.

Use an std::string_view to avoid copying the string.
#include <string_view>
// ...
const std::string_view strview(p, 4);
Keep in mind that string views are not guaranteed to be null-terminated, so be careful when using them in APIs that expect a null-terminated char*. Also, when the string they are viewing goes out of scope, the string view itself becomes invalid (just like a pointer would.)
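One way to check whether a construction copied the characters is to compare the address of the resulting object's data with the address of the original buffer. The following is a minimal sketch, assuming C++17; the helper names view_aliases_buffer and string_owns_copy are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Returns true when the view points straight at the caller's buffer,
// i.e. no characters were copied.
bool view_aliases_buffer(const char* p, std::size_t n) {
    assert(p != nullptr);
    std::string_view view(p, n);
    return view.data() == p;
}

// Returns true when std::string stored the characters in its own storage,
// i.e. the characters were copied (how the copy is done is up to the library).
bool string_owns_copy(const char* p, std::size_t n) {
    assert(p != nullptr);
    std::string owned(p, n);
    return owned.data() != p;
}
```

Both helpers should return true for any valid buffer: the view aliases it, while the string always owns a separate copy.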

Related

String to const char conversion using c_str() or toCharArray()?

I want to know more about programming and after a bit of googling I found how to convert a string to a const char.
String text1;
What I do not understand is why c_str() works,
const char *text2 = text1.c_str();
contrary to toCharArray()?
const char *text2 = text1.toCharArray();
or
const char text2 = text1.toCharArray();
The latter is more logical to me as I want to convert a string to a char, and then turn it into a const char. But that doesn't work because one is a string, the other is a char. The former, as I understand, converts the string to a C-type string and then turns it into a const char. Here, the string suddenly isn't an issue anymore oO
a) Why does it need a C-type string conversion and why does it work only then?
b) Why is the pointer needed?
c) Why does a simple toCharArray() not work?
Or do I do something terribly wrong?
Thanks heaps.
I am using PlatformIO with Arduino platform.
If you need to modify the returned c-style string in any way, or have it persist after you modify the original String, you should use toCharArray.
If you only need a null-terminated c-style string to pass as a read-only parameter to a function, use c_str.
Arduino reference for String.toCharArray()
Arduino reference for String.c_str()
The interface (and implementation) of toCharArray is shown below, from source
void toCharArray(char *buf, unsigned int bufsize, unsigned int index=0) const
{ getBytes((unsigned char *)buf, bufsize, index); }
So your first issue is that you're trying to use it incorrectly. toCharArray will COPY the underlying characters of your String into a buffer that you provide. This must be extra space that you have allocated, either in a buffer on the stack, or in some other writable area of memory. You would do it like this.
String str = "I am a string!";
char buf[5];
str.toCharArray(buf, 5);
// buf is now "I am\0"
// or you can start at a later index, here index 5
str.toCharArray(buf, 5, 5);
// buf is now "a st\0"
// we can also change characters in the buffer
buf[1] = 'X';
// buf is now "aXst\0"
// modifying the original String does not invalidate the buffer
str = "Je suis une chaine!";
// buf is still "aXst\0"
This allows you to copy a string partially, or at a later index, or anything you want. Most importantly, this array you copy into is mutable. We can change it, and since it's a copy, it doesn't affect the original String we copied it from. This flexibility comes with a cost. First, we have to have a large enough buffer, which may not be known at compile time, and takes up memory. Second, that copying takes time to do.
But what if we're calling a function that just wants to read a c-style string as input, and doesn't need to modify it at all?
That's where c_str() comes in. The String object has an underlying c-string type array (yes, null terminator and all). c_str() simply returns a const char* to this array. We make it const so that we don't accidentally change it. An object's underlying data should not be changed by random functions outside of its control.
This is the ENTIRE code for c_str():
const char* c_str() const { return buffer; }
You already know how to use it, but to illustrate a difference:
String str = "I am another string!";
const char* c = str.c_str();
// c[1] = 'X'; // error, cannot modify a const object
// modifying the original string may reallocate the underlying buffer
str = "Je suis une autre chaine!";
// dereferencing c now may point to invalid memory
Since c_str() simply returns the underlying data pointer, it's fast. But we don't want other functions to be allowed to modify this data, so it's const.
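The copy-versus-pointer distinction is not specific to Arduino. As a rough desktop analogue, here is a sketch using std::string instead of the Arduino String class (an assumption, since the Arduino class is not available off-device). The helper to_char_array is hypothetical and only mimics the copying behaviour described above.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Copies up to bufsize - 1 characters of s, starting at index, into buf
// and null-terminates the result. The buffer is an independent copy.
void to_char_array(const std::string& s, char* buf, std::size_t bufsize,
                   std::size_t index = 0) {
    assert(buf != nullptr && bufsize > 0);
    std::size_t n = 0;
    if (index < s.size()) {
        n = s.copy(buf, bufsize - 1, index);  // copy() does not add a terminator
    }
    buf[n] = '\0';
}
```

Because the buffer is a copy, reassigning the source string afterwards leaves the buffer contents unchanged, whereas a pointer obtained from c_str() should not be used after the string is modified.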

How can I transfer string to char* (not const char*)

I wanna do something like:
string result;
char* a[100];
a[0]=result;
it seems that result.c_str() has to be const char*. Is there any way to do this?
You can take the address of the first character in the string.
a[0] = &result[0];
This is guaranteed to work in C++11. (The internal string representation must be contiguous and null-terminated like a C-style string)
In C++03 these guarantees do not exist, but all common implementations will work.
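As an illustration of the &result[0] approach, here is a small sketch assuming C++11 or later. The function legacy_uppercase is a hypothetical stand-in for an old C API that edits a null-terminated buffer in place.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical legacy C-style function: modifies the buffer in place
// and stops at the null terminator.
void legacy_uppercase(char* s) {
    assert(s != nullptr);
    for (; *s != '\0'; ++s) {
        *s = static_cast<char>(std::toupper(static_cast<unsigned char>(*s)));
    }
}

// Takes the string by value so the caller's original is left untouched.
std::string uppercase_via_pointer(std::string result) {
    // Since C++11 the storage is contiguous and null-terminated,
    // so &result[0] can be handed to code expecting a char*.
    legacy_uppercase(&result[0]);
    return result;
}
```

The callee must not write past result.size() characters, and the pointer should not be kept after the string is modified or destroyed.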
string result;
char a[100] = {0};
strncpy(a, result.c_str(), sizeof(a) - 1);
There is a member function (method) called "copy" that does this,
but you need to create the buffer first,
like this:
string result;
char* a[100];
a[0] = new char[result.length() + 1];
result.copy(a[0], result.length(), 0);
a[0][result.length()] = '\0';
(references: http://www.cplusplus.com/reference/string/basic_string/copy/ )
By the way, I wonder whether you meant
string result;
char a[100];
You can do:
char a[100];
::strncpy(a, result.c_str(), 100);
Be careful of null termination.
The old fashioned way:
#include <string.h>
a[0] = strdup(result.c_str()); // allocates memory for a new string and copies it over
[...]
free(a[0]); // don't forget this or you leak memory!
If you really, truly can't avoid doing this, you shouldn't throw away all that C++ offers, and descend to using raw arrays and horrible functions like strncpy.
One reasonable possibility would be to copy the data from the string to a vector:
char const *temp = result.c_str();
std::vector<char> a(temp, temp+result.size()+1);
You can usually leave the data in the string though -- if you need a non-const pointer to the string's data, you can use &result[0].

std::string.c_str() has different value than std::string?

I have been working with C++ strings and trying to load char * strings into std::string by using C functions such as strcpy(). Since strcpy() takes char * as a parameter, I have to cast it which goes something like this:
std::string destination;
unsigned char *source;
strcpy((char*)destination.c_str(), (char*)source);
The code works fine and when I run the program in a debugger, the value of *source is stored in destination, but for some odd reason it won't print out with the statement
std::cout << destination;
I noticed that if I use
std::cout << destination.c_str();
The value prints out correctly and all is well. Why does this happen? Is there a better method of copying an unsigned char* or char* into a std::string (stringstreams?) This seems to only happen when I specify the string as foo.c_str() in a copying operation.
Edit: To answer the question "why would you do this?", I am using strcpy() as a plain example. There are other times that it's more complex than assignment. For example, having to copy only X amount of string A into string B using strncpy() or passing a std::string to a function from a C library that takes a char * as a parameter for a buffer.
Here's what you want
std::string destination = source; // for an unsigned char*, use reinterpret_cast<const char*>(source)
What you're doing is wrong on so many levels... you're writing over the inner representation of a std::string... I mean... not cool man... it's much more complex than that, arrays being resized, read-only memory... the works.
This is not a good idea at all for two reasons:
destination.c_str() is a const pointer, and casting away its const and writing to it is undefined behavior.
You haven't set the size of the string, meaning that it won't even necessarily have a large enough buffer to hold the string, which is likely to cause an access violation.
std::string has a constructor which allows it to be constructed from a char*, so simply write:
std::string destination = source; // for an unsigned char*, use reinterpret_cast<const char*>(source)
Well, what you are doing is undefined behavior. c_str() returns a const char * and is not meant to be written through. Why not use the provided constructor or assignment operator?
std::string defines an implicit conversion from const char* to std::string... so use that.
You decided to cast away an error as c_str() returns a const char*, i.e., it does not allow for writing to its underlying buffer. You did everything you could to get around that and it didn't work (you shouldn't be surprised at this).
c_str() returns a const char* for good reason. You have no idea if this pointer points to the string's underlying buffer. You have no idea if this pointer points to a memory block large enough to hold your new string. The library is using its interface to tell you exactly how the return value of c_str() should be used and you're ignoring that completely.
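A plausible explanation for the symptom in the question is that streaming a std::string writes size() characters, and writing through c_str() never updates the size, so the string still reports a length of 0. Since the original code is undefined behavior, this is only a likely explanation. A minimal sketch of the supported route follows; the helper name from_unsigned is made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Builds a std::string from a null-terminated unsigned char buffer.
// The constructor copies the characters and sets the size correctly.
std::string from_unsigned(const unsigned char* source) {
    assert(source != nullptr);
    return std::string(reinterpret_cast<const char*>(source));
}

// Builds a std::string from the first count bytes; no terminator is required.
std::string from_unsigned(const unsigned char* source, std::size_t count) {
    assert(source != nullptr);
    return std::string(reinterpret_cast<const char*>(source), count);
}
```

With either overload the resulting string knows its own length, so printing it with std::cout and printing its c_str() give the same text.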
Do not do what you are doing!!!
I repeat!
DO NOT DO WHAT YOU ARE DOING!!!
That it seems to sort of work when you do some weird things is a consequence of how the string class was implemented. You are almost certainly writing in memory you shouldn't be and a bunch of other bogus stuff.
When you need to interact with a C function that writes to a buffer, there are two basic methods:
std::string read_from_sock(int sock) {
    char buffer[1024] = "";
    // read() is declared in <unistd.h>; n is the number of bytes actually read
    ssize_t n = read(sock, buffer, sizeof buffer);
    if (n > 0) {
        return std::string(buffer, buffer + n);
    }
    return std::string();
}
Or you might try the peek method:
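std::string read_from_sock(int sock) {
    // recv() is declared in <sys/socket.h>. Whether a zero-length peek reports
    // the number of pending bytes depends on the platform and socket type.
    ssize_t n = recv(sock, 0, 0, MSG_PEEK);
    if (n > 0) {
        std::vector<char> buf(n);
        n = recv(sock, &buf[0], buf.size(), 0);
        if (n > 0) {
            return std::string(buf.begin(), buf.begin() + n);
        }
    }
    return std::string();
}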
Of course, these are not very robust versions...but they illustrate the point.
First you should note that the value returned by c_str is a const char* and must not be modified. Actually it even does not have to point to the internal buffer of string.
In response to your edit:
having to copy only X amount of string A into string B using strncpy()
If string A is a char array, and string B is std::string, and strlen(A) >= X, then you can do this:
B.assign(A, A + X);
passing a std::string to a function from a C library that takes a char * as a parameter for a buffer
If the parameter is actually const char *, you can use c_str() for that. But if it is just plain char *, and you are using a C++11 compliant compiler, then you can do the following:
c_function(&B[0]);
However, you need to ensure that there is room in the string for the data (same as if you were using a plain c-string), which you can do with a call to the resize() function. If the function writes an unspecified number of characters to the string as a null-terminated c-string, then you will probably want to truncate the string afterward, like this:
B.resize(B.find('\0'));
The reason you can safely do this in a C++11 compiler and not a C++03 compiler is that in C++03, strings were not guaranteed by the standard to be contiguous, but in C++11, they are. If you want the guarantee in C++03, then you can use std::vector<char> instead.
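Putting the resize, call, and truncate steps together might look like the sketch below, assuming C++11 or later. The function c_function here is a hypothetical stand-in with an assumed (buffer, length) signature, not a real library call. The truncation checks for npos, because resize(npos) would throw if no null byte were found.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical C-style API: writes a null-terminated message into buf,
// never using more than buflen bytes including the terminator.
void c_function(char* buf, std::size_t buflen) {
    assert(buf != nullptr && buflen > 0);
    const char msg[] = "written by C code";
    std::strncpy(buf, msg, buflen - 1);
    buf[buflen - 1] = '\0';
}

std::string call_c_function(std::size_t capacity) {
    std::string B;
    B.resize(capacity);                  // make room before handing out the pointer
    if (B.empty()) {
        return B;                        // nothing to write into
    }
    c_function(&B[0], B.size());
    const std::size_t end = B.find('\0');
    if (end != std::string::npos) {
        B.resize(end);                   // drop everything from the first '\0'
    }
    return B;
}
```

For example, call_c_function(64) would return the full message, while call_c_function(8) would return only the first seven characters.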

Can I get a non-const C string back from a C++ string?

Const-correctness in C++ is still giving me headaches. In working with some old C code, I find myself needing to turn a C++ string object into a C string and assign it to a variable. However, the variable is a char * and c_str() returns a const char *. Is there a good way to get around this without having to roll my own function to do it?
Edit: I am also trying to avoid calling new. I will gladly trade slightly more complicated code for fewer memory leaks.
C++17 and newer:
foo(s.data(), s.size());
C++11, C++14:
foo(&s[0], s.size());
However this needs a note of caution: The result of &s[0]/s.data()/s.c_str() is only guaranteed to be valid until any member function is invoked that might change the string. So you should not store the result of these operations anywhere. The safest is to be done with them at the end of the full expression, as my examples do.
Pre-C++11 answer:
Since, for reasons I can't explain, nobody has answered this the way I do now, and since other questions are now being closed pointing to this one, I'll add this here, even though coming a year too late will mean that it hangs at the very bottom of the pile...
With C++03, std::string isn't guaranteed to store its characters in a contiguous piece of memory, and the result of c_str() doesn't need to point to the string's internal buffer, so the only way guaranteed to work is this:
std::vector<char> buffer(s.begin(), s.end());
foo(&buffer[0], buffer.size());
s.assign(buffer.begin(), buffer.end());
These limitations no longer apply in C++11, where std::string is guaranteed to store its characters contiguously.
There is an important distinction you need to make here: is the char* to which you wish to assign this "morally constant"? That is, is casting away const-ness just a technicality, and you really will still treat the string as a const? In that case, you can use a cast - either C-style or a C++-style const_cast. As long as you (and anyone else who ever maintains this code) have the discipline to treat that char* as a const char*, you'll be fine, but the compiler will no longer be watching your back, so if you ever treat it as a non-const you may be modifying a buffer that something else in your code relies upon.
If your char* is going to be treated as non-const, and you intend to modify what it points to, you must copy the returned string, not cast away its const-ness.
I guess there is always strcpy.
Or use char* strings in the parts of your C++ code that must interface with the old stuff.
Or refactor the existing code to compile with the C++ compiler and then to use std::string.
There's always const_cast...
std::string s("hello world");
char *p = const_cast<char *>(s.c_str());
Of course, that's basically subverting the type system, but sometimes it's necessary when integrating with older code.
If you can afford extra allocation, instead of a recommended strcpy I would consider using std::vector<char> like this:
// suppose you have your string:
std::string some_string("hello world");
// you can make a vector from it like this:
std::vector<char> some_buffer(some_string.begin(), some_string.end());
// suppose your C function is declared like this:
// some_c_function(char *buffer);
// you can just pass this vector to it like this:
some_c_function(&some_buffer[0]);
// if that function wants a buffer size as well,
// just give it some_buffer.size()
To me this is a bit more of a C++ way than strcpy. Take a look at Meyers' Effective STL Item 16 for a much nicer explanation than I could ever provide.
You can use the copy method:
len = myStr.copy(cStr, myStr.length());
cStr[len] = '\0';
Where myStr is your C++ string and cStr a char * with at least myStr.length()+1 size. Also, len is of type size_t and is needed, because copy doesn't null-terminate cStr.
Just use const_cast<char*>(str.data())
Do not feel bad or weird about it, it's perfectly good style to do this.
It's guaranteed to work in C++11. The fact that it's const qualified at all is arguably a mistake by the original standard before it; in C++03 it was possible to implement string as a discontinuous list of memory, but no one ever did it. There is not a compiler on earth that implements string as anything other than a contiguous block of memory, so feel free to treat it as such with complete confidence.
If you know that the std::string is not going to change, a C-style cast will work.
std::string s("hello");
char *p = (char *)s.c_str();
Of course, p is pointing to some buffer managed by the std::string. If the std::string goes out of scope or the buffer is changed (i.e., written to), p will probably be invalid.
The safest thing to do would be to copy the string if refactoring the code is out of the question.
std::string vString;
vString.resize(256); // allocate some space, up to you
char* vStringPtr(&vString.front());
// assign the value to the string (by using a function that copies the value).
// don't exceed vString.size() here!
// now make sure you erase the extra capacity after the first encountered \0.
vString.erase(std::find(vString.begin(), vString.end(), 0), vString.end());
// and here you have the C++ string with the proper value and bounds.
This is how you turn a C++ string to a C string. But make sure you know what you're doing, as it's really easy to step out of bounds using raw string functions. There are moments when this is necessary.
If c_str() is returning to you a copy of the string object internal buffer, you can just use const_cast<>.
However, if c_str() is giving you direct access to the string object's internal buffer, make an explicit copy instead of removing the const.
Since c_str() gives you direct const access to the data structure, you probably shouldn't cast it. The simplest way to do it without having to preallocate a buffer is to just use strdup.
char* tmpptr;
tmpptr = strdup(myStringVar.c_str());
oldfunction(tmpptr);
free(tmpptr);
It's quick, easy, and correct.
In C++, if you want a char * from a string.c_str()
(to give it for example to a function that only takes a char *),
you can cast it to char * directly to lose the const from .c_str()
Example:
launchGame((char *) string.c_str());
C++17 adds a char* string::data() noexcept overload. So if your string object isn't const, the pointer returned by data() isn't either and you can use that.
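The overload described above can be checked at compile time. This is a small sketch assuming a C++17 compiler; legacy_fill and filled are hypothetical names standing in for an old function that needs a mutable buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <utility>

// C++17: data() on a non-const string yields char*, on a const string const char*.
static_assert(std::is_same_v<decltype(std::declval<std::string&>().data()), char*>,
              "non-const data() should return char*");
static_assert(std::is_same_v<decltype(std::declval<const std::string&>().data()), const char*>,
              "const data() should return const char*");

// Hypothetical legacy function that writes n bytes into a mutable buffer.
void legacy_fill(char* buf, std::size_t n, char c) {
    assert(buf != nullptr || n == 0);
    for (std::size_t i = 0; i < n; ++i) {
        buf[i] = c;
    }
}

std::string filled(std::size_t n, char c) {
    std::string s(n, ' ');
    legacy_fill(s.data(), s.size(), c);  // no cast needed in C++17
    return s;
}
```

The callee must stay within s.size() characters; the string does not grow when written through the pointer.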
Is it really that difficult to do yourself?
#include <algorithm>
#include <cstring>
#include <string>
// Allocates a new buffer; the caller must delete[] it.
char *convert(const std::string &str)
{
    size_t len = str.length();
    char *buf = new char[len + 1];
    memcpy(buf, str.data(), len);
    buf[len] = '\0';
    return buf;
}
// Copies into a caller-supplied buffer of len bytes (len must be at least 1),
// truncating if the string does not fit. Never reads past the end of str.
char *convert(const std::string &str, char *buf, size_t len)
{
    size_t n = std::min(str.length(), len - 1);
    memcpy(buf, str.data(), n);
    buf[n] = '\0';
    return buf;
}
// A crazy template solution to avoid passing in the array length
// but loses the ability to pass in a dynamically allocated buffer
template <size_t len>
char *convert(const std::string &str, char (&buf)[len])
{
    return convert(str, buf, len);
}
Usage:
std::string str = "Hello";
// Use buffer we've allocated
char buf[10];
convert(str, buf);
// Use buffer allocated for us
char *heap_buf = convert(str);
delete [] heap_buf;
// Use dynamic buffer of known length
heap_buf = new char[10];
convert(str, heap_buf, 10);
delete [] heap_buf;

Get bytes from std::string in C++

I'm working in a C++ unmanaged project.
I need to know how can I take a string like this "some data to encrypt" and get a byte[] array which I'm gonna use as the source for Encrypt.
In C# I do
for (int i = 0; i < text.Length; i++)
buffer[i] = (byte)text[i];
What I need to know is how to do the same but using unmanaged C++.
Thanks!
If you just need read-only access, then c_str() will do it:
char const *c = myString.c_str();
If you need read/write access, then you can copy the string into a vector. vectors manage dynamic memory for you. You don't have to mess with allocation/deallocation then:
std::vector<char> bytes(myString.begin(), myString.end());
bytes.push_back('\0');
char *c = &bytes[0];
std::string::data would seem to be sufficient and most efficient. If you want to have non-const memory to manipulate (strange for encryption) you can copy the data to a buffer using memcpy:
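// a variable-length array is not standard C++, so use a vector for the buffer
std::vector<unsigned char> buffer(mystring.length());
memcpy(buffer.data(), mystring.data(), mystring.length());
STL fanboys would encourage you to use std::copy instead:
std::copy(mystring.begin(), mystring.end(), buffer.begin());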
but there really isn't much of an upside to this. If you need null termination use std::string::c_str() and the various string duplication techniques others have provided, but I'd generally avoid that and just query for the length. Particularly with cryptography you just know somebody is going to try to break it by shoving nulls in to it, and using std::string::data() discourages you from lazily making assumptions about the underlying bits in the string.
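The point about embedded nulls can be made concrete with a short sketch. The helper names c_length and with_embedded_null are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Length as seen by C string functions: stops at the first '\0'.
std::size_t c_length(const std::string& s) {
    return std::strlen(s.c_str());
}

// Builds a 7-byte string with a null byte in the middle.
// The explicit length is needed; without it the literal would stop at '\0'.
std::string with_embedded_null() {
    std::string s("abc\0def", 7);
    assert(s.size() == 7);
    return s;
}
```

Code that relies on null termination would only process the first three bytes of that string, while data() together with size() covers all seven.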
Normally, encryption functions take
encrypt(const void *ptr, size_t bufferSize);
as arguments. You can pass c_str and length directly:
encrypt(strng.c_str(), strng.length());
This way, no extra space is allocated or wasted.
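As a sketch of that call shape, the example below passes the string's bytes and length to a function taking (const void *, size_t). The function toy_encrypt is a made-up placeholder with an assumed extra key parameter; XOR with a fixed key is not real encryption and is only used so the example has something to compute.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an encrypt(const void*, size_t) style API.
// It reads the input bytes directly and writes the result to a new vector.
std::vector<unsigned char> toy_encrypt(const void* ptr, std::size_t bufferSize,
                                       unsigned char key) {
    assert(ptr != nullptr || bufferSize == 0);
    const unsigned char* in = static_cast<const unsigned char*>(ptr);
    std::vector<unsigned char> out(bufferSize);
    for (std::size_t i = 0; i < bufferSize; ++i) {
        out[i] = static_cast<unsigned char>(in[i] ^ key);
    }
    return out;
}

// The string's own storage is passed as-is; no intermediate copy of the input.
std::vector<unsigned char> encrypt_string(const std::string& text, unsigned char key) {
    return toy_encrypt(text.data(), text.size(), key);
}
```

Applying toy_encrypt again with the same key should restore the original bytes, which gives a quick sanity check.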
In C++17 and later you can use std::byte to represent actual byte data. I would recommend something like this:
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <string>
#include <vector>

std::vector<std::byte> to_bytes(std::string const& s)
{
    std::vector<std::byte> bytes;
    bytes.reserve(std::size(s));
    std::transform(std::begin(s), std::end(s), std::back_inserter(bytes), [](char c){
        return std::byte(c);
    });
    return bytes;
}
From a std::string you can use the c_str() method if you want to get at the char buffer pointer.
It looks like you just want copy the characters of the string into a new buffer. I would simply use the std::string::copy function:
length = str.copy( buffer, str.size() );
If you just need to read the data.
encrypt(str.data(),str.size());
If you need a read/write copy of the data, put it into a vector. (Don't dynamically allocate space; that's the job of vector.)
std::vector<byte> source(str.begin(),str.end());
encrypt(&source[0],source.size());
Of course we are all assuming that byte is a char!!!
If this is just plain vanilla C, then:
strcpy(buffer, text.c_str());
Assuming that buffer is allocated and large enough to hold the contents of 'text', which is the assumption in your original code.
If encrypt() takes a 'const char *' then you can use
encrypt(text.c_str())
and you do not need to copy the string.
You might go with range-based for loop, which would look like this:
std::vector<std::byte> getByteArray(const std::string& str)
{
    std::vector<std::byte> buffer;
    buffer.reserve(str.size());
    for (char str_char : str)
        buffer.push_back(std::byte(str_char));
    return buffer;
}
I don't think you want to use the C# code you have there. The framework provides System.Text.Encoding.ASCII (and UTF-* encodings):
string str = "some text";
byte[] bytes = System.Text.Encoding.ASCII.GetBytes(str);
Your problems stem from ignoring the encoding in C#, not from your C++ code.