Copy a part of an std::string in a char* pointer - c++

Let's suppose I've this code snippet in C++
char* str;
std::string data = "This is a string.";
I need to copy the string data (except the first and the last characters) in str.
My solution that seems to work is creating a substring and then performing the std::copy operation like this
std::string substring = data.substr(1, size - 2);
str = new char[size - 1];
std::copy(substring.begin(), substring.end(), str);
str[size - 2] = '\0';
But maybe this is a bit of overkill because I create a new string. Is there a simpler way to achieve this goal? Maybe working with offsets in the std::copy calls?
Thanks

As mentioned above, you should consider keeping the sub-string as a std::string and use c_str() method when you need to access the underlying chars.
However-
If you must create the new string as a dynamic char array via new you can use the code below.
It checks whether data is long enough, and if so allocates memory for str and uses std::copy similarly to your code, but with adapted iterators.
Note: there is no need to allocate a temporary std::string for the sub-string.
The Code:
#include <algorithm> // std::copy
#include <string>
#include <iostream>
int main()
{
std::string data = "This is a string.";
auto len = data.length();
char* str = nullptr;
if (len > 2)
{
auto new_len = len - 2;
str = new char[new_len+1]; // add 1 for zero termination
std::copy(data.begin() + 1, data.end() - 1, str); // copy from 2nd char till one before the last
str[new_len] = '\0'; // add zero termination
std::cout << str << std::endl;
// ... use str
delete[] str; // must be released eventually
}
}
Output:
his is a string
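The first suggestion in this answer, keeping the sub-string as a std::string, can be sketched as follows. The function name strip_ends is hypothetical and chosen only for illustration; the sketch assumes that inputs shorter than 3 characters should simply give an empty result.

```cpp
#include <string>

// Returns `data` without its first and last characters.
// Inputs shorter than 3 characters yield an empty string.
std::string strip_ends(const std::string& data)
{
    if (data.size() <= 2)
        return std::string();
    return data.substr(1, data.size() - 2);
}
```

A caller would store the result (std::string inner = strip_ends(data);) and pass inner.c_str() wherever a const char* is needed. That pointer stays valid only while inner is alive and unmodified.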

There is:
int length = data.length() - 1;
memcpy(str, data.c_str() + 1, length);
str[length - 1] = 0;
This will copy the string in data, starting at position [1] (instead of [0]), and keep copying until length() - 1 bytes have been copied (-1 because you want to omit the first character). str must already point to a buffer of at least that many bytes.
The final character then gets overwritten with the terminating \0, finalizing the string and disposing of the final character.
Of course this approach will cause problems if the string does not have at least 2 characters, so you should check for that beforehand.
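A guarded variant of the memcpy approach might look like the sketch below. It is not the answerer's exact code: the function name copy_without_ends is hypothetical, the buffer is allocated inside the function, and only the characters to keep are copied, so nothing has to be overwritten afterwards.

```cpp
#include <cstddef>
#include <cstring>
#include <string>

// Copies `data` minus its first and last characters into a new[]-allocated,
// null-terminated buffer. Returns nullptr if `data` has fewer than 2 characters.
// The caller owns the result and must release it with delete[].
char* copy_without_ends(const std::string& data)
{
    if (data.size() < 2)
        return nullptr;
    const std::size_t n = data.size() - 2;   // number of characters to keep
    char* str = new char[n + 1];             // +1 for the terminator
    std::memcpy(str, data.data() + 1, n);    // start after the first character
    str[n] = '\0';
    return str;
}
```

Returning nullptr for short inputs is one possible policy; returning an empty buffer would work as well.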

Related

How to convert a std::string which contains '\0' to a char* array?

I have a string like,
string str="aaa\0bbb";
and I want to copy the value of this string to a char* variable. I tried the following methods but none of them worked.
char *c=new char[7];
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
strcpy(c,str.data()); // c="aaa"
str.copy(c,7); // c="aaa"
How can I copy that string to a char* variable without losing any data?
You can do it the following way
#include <iostream>
#include <string>
#include <cstring>
int main()
{
std::string s( "aaa\0bbb", 7 );
char *p = new char[s.size() + 1];
std::memcpy( p, s.c_str(), s.size() );
p[s.size()] = '\0';
size_t n = std::strlen( p );
std::cout << p << std::endl;
std::cout << p + n + 1 << std::endl;
delete[] p; // release the buffer
}
The program output is
aaa
bbb
You need to keep somewhere in the program the allocated memory size for the character array equal to s.size() + 1.
If there is no need to keep the "second part" of the object as a string then you may allocate memory of the size s.size() and not append it with the terminating zero.
In fact these methods used by you
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
str.copy(c,7); // c="aaa"
are correct. They copy exactly 7 characters, provided that you are not going to append a terminating zero to the resulting array. The problem is that you are trying to output the resulting character array as a string, and the operators used output only the characters before the embedded zero character.
Your string consists of 3 characters. You may try to use
using namespace std::literals;
string str="aaa\0bbb"s;
to create a string with \0 inside; it will consist of 7 characters.
It still won't help if you use it as a C string ((const) char*): C strings can't contain a zero character.
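The difference between the constructors can be checked with a small sketch. The function name make_binary_string is hypothetical, and the s suffix assumes a C++14 or later compiler.

```cpp
#include <cassert>
#include <string>

// Builds "aaa\0bbb" in three ways and checks how many characters each keeps.
std::string make_binary_string()
{
    using namespace std::literals;           // enables the s suffix (C++14)
    std::string with_length("aaa\0bbb", 7);  // explicit length: '\0' is kept
    std::string with_suffix = "aaa\0bbb"s;   // the literal's length is known
    std::string truncated("aaa\0bbb");       // const char* constructor stops at '\0'
    assert(with_length.size() == 7);
    assert(with_suffix == with_length);
    assert(truncated.size() == 3);
    return with_length;
}
```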
There are two things to consider: (1) make sure that str already contains the complete literal (the constructor taking only a char* parameter might truncate at the string terminator char). (2) Provided that str actually contains the complete literal, statement memcpy(c,str.data(),7) should work. The only thing then is how you "view" the result, because if you pass c to printf or cout, then they will stop printing once the first string terminating character is reached.
So: To make sure that your string literal "aaa\0bbb" gets completely copied into str, use std::string str("aaa\0bbb",7); Then, try to print the contents of c in a loop, for example:
std::string str("aaa\0bbb",7);
const char *c = str.data();
for (int i=0; i<7; i++) {
printf("%c", c[i] ? c[i] : '0');
}
You already did (not really, see edit below). The problem however, is that whatever you are using to print the string (printf?), is using the c string convention of ending strings with a '\0'. So it starts reading your data, but when it gets to the 0 it will assume it is done (because it has no other way).
If you want to simply write the buffer to the output, you will have to do this with something like
write(STDOUT_FILENO, c, 7); // POSIX write() from <unistd.h>; it takes a file descriptor, not a FILE*
Now write has information about where the data ends, so it can write all of it.
Note however that your terminal cannot really show a \0 character, so it might show some weird symbol or nothing at all. If you are on linux you can pipe into hexdump to see what the binary output is.
EDIT:
Just realized that your string also initializes from a const char* by reading until the zero. So you will also have to use a constructor that tells it to read past the zero:
std::string("data\0afterzero", 14);
(there are prettier solutions probably)
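In portable C++, the same length-aware output can be sketched with std::ostream::write. Both function names below are hypothetical: write_all emits every byte, and visible is only a display aid that renders each zero byte as the two characters \0.

```cpp
#include <ostream>
#include <sstream>
#include <string>

// Writes every byte of `s` to `out`, embedded '\0' characters included.
void write_all(std::ostream& out, const std::string& s)
{
    out.write(s.data(), static_cast<std::streamsize>(s.size()));
}

// Returns a copy of `s` in which each '\0' is shown as the two characters \0.
std::string visible(const std::string& s)
{
    std::ostringstream out;
    for (char ch : s) {
        if (ch == '\0')
            out << "\\0";
        else
            out << ch;
    }
    return out.str();
}
```

Calling write_all(std::cout, str) sends all 7 bytes to standard output; what the terminal shows for the zero byte still depends on the terminal.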

Convert char * to QString and remove zeros

In my app I read a string field from a file in local (not Unicode) charset.
The field is 10 bytes long; the remainder is filled with zeros if the string is shorter than 10 bytes.
char str ="STRING\0\0\0\0"; // that was read from file
QByteArray fieldArr(str,10); // fieldArr now is STRING\000\000\000\000
fieldArr = fieldArr.trimmed(); // for some reason the array still contains zeros
QTextCodec *textCodec = QTextCodec::codecForLocale();
QString field = textCodec->toUnicode(fieldArr).trimmed(); // does not remove the zeros either
So my question - how can I remove trailing zeros from a string?
P.S. I see the zeros in the "Locals and Expressions" window while debugging.
I'm going to assume that str is supposed to be char const * instead of char.
Just skip the QByteArray: QTextCodec can handle a C string, and it stops at the first null byte:
QString field = textCodec->toUnicode(str).trimmed();
Addendum: The field might not be zero-terminated, there is no room to append a null byte to it, and copying the data just to prepare for another copy is wasteful. I therefore suggest calculating the length ourselves and using the toUnicode overload that accepts a char pointer and a length.
std::find is good for this, since it returns the ending iterator of the given range if an element is not found in it. This makes special-case handling unnecessary:
QString field = textCodec->toUnicode(str, std::find(str, str + 10, '\0') - str).trimmed();
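The std::find technique does not depend on Qt. As a minimal sketch in standard C++ (the names field_length and field_to_string are hypothetical), the length of a zero-padded fixed-width field can be computed like this:

```cpp
#include <algorithm>
#include <cstddef>
#include <string>

// Length of the text stored in a fixed-width field padded with '\0'.
// Returns `width` when the field is completely filled (no '\0' present).
std::size_t field_length(const char* field, std::size_t width)
{
    return static_cast<std::size_t>(
        std::find(field, field + width, '\0') - field);
}

// Copies only the meaningful part of the field into a std::string.
std::string field_to_string(const char* field, std::size_t width)
{
    return std::string(field, field_length(field, width));
}
```

Because std::find returns the end of the range when nothing is found, a fully used field needs no special case and no byte beyond the field is read.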
Does this work for you?
#include <QDebug>
#include <QByteArray>
int main()
{
char str[] = "STRING\0\0\0\0";
auto ba = QByteArray::fromRawData(str, 10);
qDebug() << ba.trimmed(); // does not work
qDebug() << ba.simplified(); // does not work
auto index = ba.indexOf('\0');
if (index != -1)
ba.truncate(index);
qDebug() << ba;
return 0;
}
Using fromRawData() saves an extra copy. Make sure that str stays around for as long as ba is in use.
indexOf() is safe even if the whole of str is filled, since QByteArray knows that only 10 bytes are accessible. It won't touch the 11th byte or anything after it, so there is no buffer overrun.
Once the extra \0 bytes are removed, converting to a QString is trivial.
You can truncate the string after the first \0:
const char *str = "STRING\0\0\0\0"; // Assuming that was read from file
QString field = QString::fromLocal8Bit(str, 10); // explicit length, so field == "STRING\0\0\0\0"
int index = field.indexOf(QChar::Null);
if (index != -1)
    field.truncate(index); // field == "STRING" (without '\0' at the end)
I would do it like this:
const char* str = "STRING\0\0\0\0";
QByteArray fieldArr;
for(quint32 i = 0; i < 10; i++)
{
if(str[i] != '\0')
{
fieldArr.append(str[i]);
}
}
QString can be constructed from a char array pointer using fromLocal8Bit. The codec is chosen the same way you do manually in your code.
You need to set the length manually to 10, since you say there is no guarantee that a terminating null byte is present.
Then you can use remove() to get rid of all null bytes. Caution: STRI\0\0\0\0NG will also result in STRING but you said that this does not happen.
const char *str = "STRING\0\0\0\0"; // that was read from file
QString field = QString::fromLocal8Bit(str, 10);
field.remove(QChar::Null);

Creating an array out of a string

I'm new to C++ and I've encountered a problem... I can't seem to create an array of characters from a string using a for loop. For example, in JavaScript you would write something like this:
var arr = [];
function setString(s) {
for(var i = s.length - 1; i >= 0; i--) {
arr.push(s[i]);
}
return arr.join("");
}
setString("Hello World!"); //Returns !dlroW olleH
I know it's a bit complicated, I do have a little bit of background knowledge on how to do it but the syntax of it is still not too familiar to me.
Is there any way that I could do that in c++ using arrays?
Could I join the array elements into one string as I do in JavaScript?
It would be greatly appreciated if you could help. Thanks in advance.
If anyone needs more information just tell me and I'll edit the post.
By the way, my code in c++ is really messy at the moment but I have an idea of what I'm doing... What I've tried is this:
function setString(s) {
string arr[s.size() - 1];
for(int i = s.size() - 1; i >= 0; i--) {
arr[i] = s.at(i); //This is where I get stuck at...
//I don't know if I'm doing something wrong or not.
}
}
It would be nice if someone told me what I'm doing wrong or what I need to put or take out of the code. It's a console application compiled in Code::Blocks
std::string has the c_str() method that returns a C style string, which is just an array of characters.
Example:
std::string myString = "Hello, World!";
const char *characters = myString.c_str();
The closest thing to a direct translation of your function:
string setString(string s) {
string arr;
for(int i = s.length() - 1; i >= 0; i--) {
arr.push_back(s[i]);
}
return arr;
}
A std::string is a dynamic array underneath a fairly thin wrapper. There is no need to copy character by character, as it will do it properly for you:
If the character array is null-terminated (that is, the last element is a '\0'):
const char* c = "Hello, world!"; // '\0' is implicit for string literals
std::string s = c; // this will copy the entire string - no need for your loop
If the character array is not null-terminated:
char c[4] = {'a', 'b', 'c', 'd'}; // creates a character array that will not work with <cstring> functions (e.g. strlen)
std::string s(c, 4); // copies 4 characters from c into s - again, no need for your loop
If you cannot use std::string (e.g. if you are forced to use ANSI C):
const char* c = "Hello, World!";
// assume c2 is already properly allocated to strlen(c) + 1 and initialized to all zeros
strcpy(c2, c);
In your javascript example, you are reversing the string, which can be done easily enough:
std::string s = "Hello, world!";
std::string s1(s.rbegin(), s.rend());
Additionally, you can cut your iterations in half (for both C++ and Javascript) if you fix your loop (pseudo-code below):
string s = "Hello, world!"
for i = 0 to s.Length / 2 - 1
char t = s[i]
s[i] = s[s.Length - 1 - i]
s[s.Length - 1 - i] = t
Which will swap the ends of the string to reverse it. Instead of looping through N items, you loop through a maximum of N / 2 items.
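In C++ the swap loop could be written roughly as below. The function name reversed is hypothetical; the string is taken by value so the caller's copy is left untouched.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Returns the characters of `s` in reverse order by swapping both ends
// and moving towards the middle; only size() / 2 iterations are needed.
std::string reversed(std::string s)
{
    const std::size_t n = s.size();
    for (std::size_t i = 0; i < n / 2; ++i)
        std::swap(s[i], s[n - 1 - i]);
    return s;
}
```

In real code, std::reverse(s.begin(), s.end()) from <algorithm> does the same job.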

Converting Zero-Terminated String To D String

Is there a function in Phobos for converting a zero-terminated string into a D-string?
So far I've only found the reverse case toStringz.
I need this in the following snippet
// Lookup user name from user id
passwd pw;
passwd* pw_ret;
immutable size_t bufsize = 16384;
char* buf = cast(char*)core.stdc.stdlib.malloc(bufsize);
getpwuid_r(stat.st_uid, &pw, buf, bufsize, &pw_ret);
if (pw_ret != null) {
// TODO: Can the following loop be replaced by some Phobos function?
size_t n = 0;
string name;
while (pw.pw_name[n] != 0) {
name ~= pw.pw_name[n];
n++;
}
writeln(name);
}
core.stdc.stdlib.free(buf);
which I use to lookup the username from a user id.
I assume UTF-8 compatibility for now.
There are two easy ways to do it: slice or std.conv.to:
const(char)* foo = c_function();
string s = to!string(foo); // done!
Or you can slice it if you are going to use it temporarily or otherwise know it won't be written to or freed elsewhere:
immutable(char)* foo = c_function();
string s = foo[0 .. strlen(foo)]; // make sure foo doesn't get freed while you're still using it
If you think it can be freed, you can also copy it by slicing and then duplicating: foo[0 .. strlen(foo)].idup gives an independent string (use .dup instead if you want a mutable char[]).
Slicing pointers works the same way in all array cases, not just strings:
int* foo = get_c_array(&c_array_length); // assume this returns the length in a param
int[] foo_a = foo[0 .. c_array_length]; // because you need length to slice
Just slice the original string (no copying). The $ inside [] is translated to str.length. If the zero is not at the end, just replace the "$ - 1" expression with its position.
import std.stdio;
void main() {
    auto str = "abc\0";
    str.trimLastZero();
    write(str);
}
void trimLastZero (ref string str) {
    // the length check avoids indexing into an empty string
    if (str.length > 0 && str[$ - 1] == 0)
        str = str[0 .. $ - 1];
}
You can do the following to strip away the trailing zeros and convert it to a string:
char[256] name;
getNameFromCFunction(name.ptr, 256);
string s = to!string(name.ptr); //<-- this is the important bit
If you just pass in name, you will convert it to a string but the trailing zeros will still be there. So pass a char pointer instead, and voila: std.conv.to will convert everything it meets until a '\0' is encountered.

Combining std::string and std::vector<char>

This is not the actual code, but this represents my problem.
std::string str1 = "head";
char *buffer = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::vector<char> mainStr(buffer, buffer + strlen(buffer));
I want to put str1 and str2 to mainStr in an order:
headbody\0bodyfoot
So the binary data is maintained. Is this possible to do this?
PS: Thanks for telling the strlen part is wrong. I just used it to represent buffer's length. :)
There should be some way of defining length of data in "buffer".
Usually character 0 is used for this and most of standard text functions assume this. So if you use character 0 for other purposes, you have to provide another way to find out length of data.
Just for example:
char buffer[]="body\0body";
std::vector<char> mainStr(buffer,buffer+sizeof(buffer)/sizeof(buffer[0]));
Here we use an array because it provides more information than a pointer: the size of the stored data. Note that this size includes the terminating '\0' of the literal.
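The size carried by an array type can also be captured with a function template, as in this sketch. The name to_vector is hypothetical, and the code assumes the array was initialized from a string literal, so its last element is the implicit terminator and is dropped.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Copies a character array, embedded '\0' bytes included, into a vector.
// N is deduced from the array type and counts the implicit terminator of
// a string literal, so the final element is left out.
template <std::size_t N>
std::vector<char> to_vector(const char (&buffer)[N])
{
    return std::vector<char>(buffer, buffer + N - 1);
}
```

For "body\0body", sizeof reports 10 while strlen reports 4, which is why strlen cannot be used to measure such a buffer.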
You cannot use strlen as it uses '\0' to determine the end of string. However, the following will do what you are looking for:
std::string head = "header";
std::string foot = "footer";
const char body[] = "body\0body";
std::vector<char> v;
v.assign(head.begin(), head.end());
std::copy(body, body + sizeof(body)/sizeof(body[0]) - 1, std::back_inserter<std::vector<char> >(v));
std::copy(foot.begin(), foot.end(), std::back_inserter<std::vector<char> >(v));
Because a string literal adds a NUL character at the end of the array, you'll want to ignore it (hence the -1 on the last iterator).
btw. strlen will not work if there are nul bytes in your string!
The code to insert into the vector is:
front:
mainStr.insert(mainStr.begin(), str1.begin(), str1.end());
back:
mainStr.insert(mainStr.end(), str2.begin(), str2.end());
With your code above (using strlen), this will print
headbodyfoot
EDIT: just changed the copy to insert as copy requires the space to be available I think.
You could use std::vector<char>::insert to append the data you need into mainStr.
Something like this:
std::string str1 = "head";
char buffer[] = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::vector<char> mainStr(str1.begin(), str1.end());
mainStr.insert(mainStr.end(), buffer, buffer + sizeof(buffer)/sizeof(buffer[0]) - 1); // -1 skips the terminating NUL
mainStr.insert(mainStr.end(), str2.begin(), str2.end());
Disclaimer: I didn't compile it.
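A compilable sketch of the insert-based approach, with the body length passed explicitly so embedded zero bytes survive, could look like this. The function name join and its parameter names are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Concatenates head + body + foot into one vector of bytes.
// `body_len` is the real length of `body`, so '\0' bytes inside it are kept.
std::vector<char> join(const std::string& head,
                       const char* body, std::size_t body_len,
                       const std::string& foot)
{
    std::vector<char> out;
    out.reserve(head.size() + body_len + foot.size());
    out.insert(out.end(), head.begin(), head.end());
    out.insert(out.end(), body, body + body_len);
    out.insert(out.end(), foot.begin(), foot.end());
    return out;
}
```

With const char body[] = "body\0body";, a call such as join("head", body, sizeof(body) - 1, "foot") passes 9 as the length, leaving out only the literal's terminating NUL.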
You can use IO streams.
std::string str1 = "head";
const char *buffer = "body\0body"; // Original code has nullbytes;
std::string str2 = "foot";
std::stringstream ss;
ss.write(str1.c_str(), str1.length())
.write(buffer, 9) // insert real length here
.write(str2.c_str(), str2.length());
std::string result = ss.str();
std::vector<char> vec(result.c_str(), result.c_str() + result.length());
str1 and str2 are string objects that write the text.
I wish compilers would fail on statements like the declaration of buffer and I don't care how much legacy code it breaks. If you're still building it you can still fix it and put in a const.
You would need to change your declaration of vector because strlen will stop at the first null character. If you did
char buffer[] = "body\0body";
then sizeof(buffer) would actually give you close to what you want although you'll get the end null-terminator too.
Once your vector mainStr is then set up correctly you could do:
std::string strConcat;
strConcat.reserve( str1.size() + str2.size() + mainStr.size() );
strConcat.assign(str1);
strConcat.append(mainStr.begin(), mainStr.end());
strConcat.append(str2);
This assumes the vector was set up using buffer, buffer + sizeof(buffer) - 1.
mainStr.resize(str1.length() + str2.length() + strlen(buffer));
memcpy(&mainStr[0], &str1[0], str1.length());
memcpy(&mainStr[str1.length()], buffer, strlen(buffer));
memcpy(&mainStr[str1.length()+strlen(buffer)], &str2[0], str2.length());