whats the difference between C strings and C++ strings? - c++

whats the difference between C Strings and C++ strings. Specially while doing dynamic memory allocation

I hardly know where to begin :-)
In C, strings are just char arrays which, by convention, end with a NUL byte. In terms of dynamic memory management, you can simply malloc the space for them (including the extra byte). Memory management when modifying strings is your responsibility:
char *s = strdup ("Hello");
char *s2 = malloc (strlen (s) + 6);
strcpy (s2, s);
strcat (s2, ", Pax");
free (s);
s = s2;
In C++, strings (std::string) are objects with all the associated automated memory management and control which makes them a lot safer and easier to use, especially for the novice. For dynamic allocation, use something like:
std::string s = "Hello";
s += ", Pax";
I know which I'd prefer to use, the latter. You can (if you need one) always construct a C string out of a std::string by using the c_str() method.

C++ strings are much safer,easier,and they support different string manipulation functions like append,find,copy,concatenation etc.
one interesting difference between c string and c++ string is illustrated through following example
#include <iostream>
using namespace std;
int main() {
char a[6]; //c string
a[5]='y';
a[3]='o';
a[2]='b';
cout<<a;
return 0;
}
output »¿boRy¤£f·Pi»¿
#include <iostream>
using namespace std;
int main()
{
string a; //c++ string
a.resize(6);
a[5]='y';
a[3]='o';
a[2]='b';
cout<<a;
return 0;
}
output boy
I hope you got the point!!

The difference is speed in some operations, and encapsulation of places where common errors occur.
std::string maintains information about the content and length of the string. This means it is not prone to buffer overruns in the same way a const char * is. It will also be faster for most operations than a const char * because the length of the string is known.
Most of the cstring operations call strlen() under the hood to find out the length of the string before operating on it, this will kill performance.
std::string is compatible with STL algorithms and other containers.
Lastly, std::string doesn't have the same "gotcha!" moments a char * will have:

Related

std::strcpy and std::strcat with a std::string argument

This is from C++ Primer 5th edition. What do they mean by "the size of largeStr". largeStr is an instance of std::string so they have dynamic sizes?
Also I don't think the code compiles:
#include <string>
#include <cstring>
int main()
{
std::string s("test");
const char ca1[] = "apple";
std::strcpy(s, ca1);
}
Am I missing something?
strcpy and strcat only operate on C strings. The passage is confusing because it describes but does not explicitly show a manually-sized C string. In order for the second snippet to compile, largeStr must be a different variable from the one in the first snippet:
char largeStr[100];
// disastrous if we miscalculated the size of largeStr
strcpy(largeStr, ca1); // copies ca1 into largeStr
strcat(largeStr, " "); // adds a space at the end of largeStr
strcat(largeStr, ca2); // concatenates ca2 onto largeStr
As described in the second paragraph, largeStr here is an array. Arrays have fixed sizes decided at compile time, so we're forced to pick some arbitrary size like 100 that we expect to be large enough to hold the result. However, this approach is "fraught with potential for serious error" because strcpy and strcat don't enforce the size limit of 100.
Also I don't think the code compiles...
As above, change s to an array and it will compile.
#include <cstring>
int main()
{
char s[100] = "test";
const char ca1[] = "apple";
std::strcpy(s, ca1);
}
Notice that I didn't write char s[] = "test";. It's important to reserve extra space for "apple" since it's longer than "test".
You are missing the
char largeStr[100];
or similar the book doesn't mention.
What you should do is forget about strcpy and strcat and C-style strings real quick. Just remember how to make c++ std::string out of them and never look back.

how character pointer could be used to point a string in c++?

First of all I am beginner in C++. I was trying to learn about type casting in C++ with strings and character pointer. Is it possible to point a string with a character pointer?
int main() {
string data="LetsTry";
cout<<(&data)<<"\n";
cout<<data<<"\n"<<"size "<<sizeof(data)<<"\n";
//char *ptr = static_cast<char*>(data);
//char *ptr=(char*)data;
char *ptr = reinterpret_cast<char*>(&data);
cout<<(ptr)<<"\n";
cout<<*ptr;
}
The above code yields outcome as below:
0x7ffea4a06150
LetsTry
size 32
`a���
`
I understand as ptr should output the address 0x7ffea4a06150
Historically, in C language strings were just a memory areas filled with characters. Consequently, when a string was passed to a function, it was passed as a pointer to its very first character, of type char *, for mutable strings, or char const *, if the function had no intent to modify string's contents. Such strings were delimited with a zero-character ((char)0 a.k.a. '\0') at the end, so for a string of length 3 you had to allocate at least four bytes of memory (three characters of the string itself plus the zero terminator); and if you only had a pointer to a string's start, to know the size of the string you'd have to iterate it to find how far is the zero-char (the standard function strlen did it). Some standard functions accepted en extra parameter for a string size if you knew it in advance (those starting with strn or, more primitive and effective, those starting with mem), others did not. To concatenate two strings you first had to allocate a sufficient buffer to contain the result etc.
The standard functions that process char pointers can still be found in STL, under the <cstring> header: https://en.cppreference.com/w/cpp/header/cstring, and std::string has synonymous methods c_str() and data() that return char pointers to its contents, should you need it.
When you write a program in C++, its main function has the header of int main(int argc, char *argv[]), where argv is the array of char pointers that contains any command-line arguments your program was run with.
Ineffective as it is, this scheme could still be regarded as an advantage over strings of limited capacity or plain fixed-size character arrays, for instance in mid-nineties, when Borland introduced the PChar type in Turbo Pascal and added a unit that exported Pascal implementations of functions from C's string.h.
std::string and const char* are different types, reinterpret_cast<char*>(&data) means reinterpret the bits located at &data as const char*, which is not we want in this case.
so assuming we have type A and type B:
A a;
B b;
the following are conversion:
a = (A)b; //c sytle
// and
a = A(b);
// and
a = static_cast<A>(b); //c++ style
the following are bit reinterpretation:
a = *(A*)&b; //c style
// and
a = *reinterpret_cast<A*>(&b); //c++ style
finally, this should works:
int main() {
string data = "LetsTry";
const char *ptr = data.c_str();
cout<< ptr << "\n";
}
bit reinterpretation is sometimes used, like when doing bit manipulation of a floating point number, but there are some rules to follow like this one What is the strict aliasing rule?
also note that cout << ptr << "\n"; is a specially case because feeds a pointer to std::cout usually output the address that pointer points to, but std::cout treats char* specially so that it output the content of that char array instead
In C++, string is class and what you doing is creating a string object. So, to use are char * you need to convert it using c_str()
You can refer below code:
std::string data = "LetsTry";
// declaring character array
char * cstr = new char [data.length()+1];
// copying the contents of the
// string to char array
std::strcpy (cstr, data.c_str());
Now, you can get use char * to point your data.

Pointer to char vs String

Consider these two pieces of code. They're converting base10 number to baseN number, where N is the number of characters in given alphabet. Actually, they generate permutations of letters of given alphabet. It's assumed that 1 is equal to first letter of the alphabet.
#include <iostream>
#include <string>
typedef unsigned long long ull;
using namespace std;
void conv(ull num, const string alpha, string *word){
int base=alpha.size();
*word="";
while (num) {
*word+=alpha[(num-1)%base];
num=(num-1)/base;
}
}
int main(){
ull nu;
const string alpha="abcdef";
string word;
for (nu=1;nu<=10;++nu) {
conv(nu,alpha,&word);
cout << word << endl;
}
return 0;
}
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef unsigned long long ull;
void conv(ull num, const char* alpha, char *word){
int base=strlen(alpha);
while (num) {
(*word++)=alpha[(num-1)%base];
num=(num-1)/base;
}
}
int main() {
char *a=calloc(10,sizeof(char));
const char *alpha="abcdef";
ull h;
for (h=1;h<=10;++h) {
conv(h,alpha,a);
printf("%s\n", a);
}
}
Output is the same:
a
b
c
d
aa
ba
ca
da
No, I didn't forget to reverse the strings, reversal was removed for code clarification.
For some reason speed is very important for me. I've tested the speed of executables compiled from the examples above and noticed that the one written n C++ using string is more than 10 times less fast than the one written in C using char *.
Each executable was compiled with -O2 flag of GCC. I was running tests using much bigger numbers to convert, such as 1e8 and more.
The question is: why is string less fast than char * in that case?
Your code snippets are not equivalent. *a='n' does not append to the char array. It changes the first char in the array to 'n'.
In C++, std::strings should be preferred to char arrays, because they're a lot easier to use, for example appending is done simply with the += operator.
Also they automatically manage their memory for you which char arrays don't do. That being said, std::strings are much less error prone than the manually managed char arrays.
Doing a trace of your code you get:
*a='n';
// 'n0000'
// ^
// a
++a;
// 'n0000'
// ^
// a
*a='o'
// 'no000'
// ^
// a
In the end, a points to its original address + 1, wich is o. If you print a you will get 'o'.
Anyways, what if you need 'nothing' instead of 'no'? It wont fit in 5 chars and you will need to reallocate mem etc. That kind of things is what string class do for you behind the scenes, and faster enough so it's not a problem almost every scenario.
It's possible to use both char * and string to handle some text in C++. It seems to me that string addition is much slower than pointer addition. Why does this happen?
That is because when you use a char array or deal with a pointer to it (char*) the memory is only allocated once. What you describe with "addition" is only an iteration of the pointer to the array. So its just moving of a pointer.
// Both allocate memory one time:
char test[4];
char* ptrTest = new char[4];
// This will just set the values which already exist in the array and will
// not append anything.
*(ptrTest++) = 't'
*(ptrTest++) = 'e';
*(ptrTest++) = 's';
*(ptrTest) = 't';
When you use a string instead, the += operator actually appends characters to the end of your string. In order to accomplish this, memory will be dynamically allocated every time you append something to the string. This process does take longer than just iterating a pointer.
// This will allocate space for one character on every call of the += operator
std::string test;
test += 't';
test += 'e';
test += 's';
test += 't';
std::string a(2,' ');
a[0] = 'n';
a[1] = 'o';
Change the size of your string in the constructor or use the reserve, resize methods, that is your choice.
You are mixing different things in your question, one is a raw representation of bytes that can get interpreted as a string, no semantics or checks, the other is an abstraction of a string with checks, believe me, it is a lot of more important the security and avoid segfaults that can lead on code injection and privilege escalation than 2ms.
From the std::string documentation (here) you can see, that the
basic_string& operator+=(charT c)
is equivalent to calling push_back(c) on that string, so
string a;
a+='n';
a+='o';
is equivalent to:
string a;
a.push_back('n');
a.push_back('o');
The push_back does take care of a lot more than the raw pointer operations and is thus slower. It for instance takes care of automatic memory management of the string class.

Segmentation fault when appending two chars - C++

I am trying to append two chars but for some reason I am getting a segmentation fault.
My code is like;
#include <string.h>
char *one = (char*)("one");
char *two = (char*)("two");
strcat(one, two);
and I seem to be getting a segmentation fault at strcat(one, two), why is that?
http://www.cplusplus.com/reference/clibrary/cstring/strcat/
the first parameter to strcat, must be big enough to hold the resulting string
try:
//assuming a,b are char*
char* sum = new char[strlen(a) +strlen(b)+1];
strcpy(sum,a);
strcat(sum,b);
There should be enough legal memory to hold the entire string.
char *one = new char[128]; //allocating enough memory!
const char *two = "two"; //"two" is const char*
strcpy(one, "one");
strcat(one, two); //now the variable "one" has enough memory to hold the entire string
By the way, if you prefer using std::string over char* in C++, such thing would be easier to handle:
#include<string>
std::string one = "one";
std::string two = "two";
one = one + two; //Concatenate
std::cout << one;
Output:
onetwo
There are two reasons for this.
If you have a pointer initialized to a string literal, that memory is read-only and modifying it will result in undefined behavior. In this case, if you try to append a string to a string literal, you'll be modifying this sort of memory, which will result in problems.
When using strcat you need to guarantee that space exists for the concatenation of the string at the location you're specifying. In this case, you cannot guarantee that, since a string literal is only guaranteed to have enough space to hold the literal itself.
To fix this, you'll want to explicitly allocate a buffer large enough to hold the concatenation of the two strings, including the null terminator. Here's one approach:
char* buffer = malloc(strlen(one) + strlen(two) + 1);
strcpy(buffer, one);
strcat(buffer, two);
Hope this helps!
The seg fault is because you attempt to write to read only memory. The first action of the strcat is to copy of 't' from the first entry of two into the null at the end of "one". So strictly the seg fault is not due to lack of storage - we never get that far. In fact this code will also likely give you a seg fault:
char* one = "one";
char* two = "";
strcat(one, two);
All this tries to do is copy a null over a null, but in read-only memory. I suppose a optimiser might happen to stop this on some platforms.
Oddly enough the following (incorrect) code will (probably) not give you a seg fault, and even give the "right" answer:
char one[] = "one";
char two[] = "two";
strcat(one, two);
printf("%s\n", one);
This successfully writes "onetwo" to stdout on my machine. We get a stack scribble, which we happen to get away with.
On the other hand this does seg fault:
char* one = "one "; // Plenty of storage, but not writable.
char two[] = "two";
strcat(one,two);
Hence the solution:
const unsigned enoughSpace = 32;
char one[enoughSpace] = "one";
char two[] = "two";
strcat(one,two);
printf("%s\n", one);
The issue with this is of course, how large to make enoughSpace in order to store what ever is coming?
Hence the functions strncat, or strcat_s, or more easily std::string.
Moral of the story: in C++, just like C, you really need to know what your memory layout is.
There are several problems here. Firstly, though you have casted the strings to mutable versions, they really are string literals and hence should not be written. Secondly, you are using strcat which will write into the string buffer, completely ignoring the length of the string buffer (it's better to use strncat which requires you to specify the length of the buffer). Lastly, since this is C++, it would be way better to use:
#include <string>
// ...
string one = "one";
string two = "two";
one.append(two);
You never reserved some space for your strings.
#include <string.h>
#include <stdio.h>
int main(void){
char str[20] = "";
strcat(str, "one");
strcat(str, "two");
printf("%s", str);
}
Would be one correct way to do this. The other (and way better) is to use the std::string class.
#include <string>
#include <cstdio>
int main(void){
std::string str;
str += "one";
str += "two";
std::printf("%s", str.c_str());
}
Your destination string should be large enough to hold both the destination and the source string. So an example would be
char one[10] = "one";
char two[4] = "two";
strcat(one,two);
strcat needs a "writeable" buffer as the target. In your example, it is a pointer to a string constant (or literal), which you cannot write to, thus it results in an exception. The target buffer can be a buffer on the stack or one dynamically allocated (e.g., with malloc).
This is not the problem of "not enough space".
char *a = "str";
Look at the code above, the pointer a is point to "static memory". The string "str" is stored in the static place in the PCB, which means it can't be overwritten.
So, the codes below will be better:
#include <string>
using std::string;
string a = "stra";
string b = "strb";
a += b;

Difference between string and char[] types in C++

For C, we use char[] to represent strings.
For C++, I see examples using both std::string and char arrays.
#include <iostream>
#include <string>
using namespace std;
int main () {
string name;
cout << "What's your name? ";
getline(cin, name);
cout << "Hello " << name << ".\n";
return 0;
}
#include <iostream>
using namespace std;
int main () {
char name[256];
cout << "What's your name? ";
cin.getline(name, 256);
cout << "Hello " << name << ".\n";
return 0;
}
(Both examples adapted from http://www.cplusplus.com.)
What is the difference between these two types in C++? (In terms of performance, API integration, pros/cons, ...)
A char array is just that - an array of characters:
If allocated on the stack (like in your example), it will always occupy eg. 256 bytes no matter how long the text it contains is
If allocated on the heap (using malloc() or new char[]) you're responsible for releasing the memory afterwards and you will always have the overhead of a heap allocation.
If you copy a text of more than 256 chars into the array, it might crash, produce ugly assertion messages or cause unexplainable (mis-)behavior somewhere else in your program.
To determine the text's length, the array has to be scanned, character by character, for a \0 character.
A string is a class that contains a char array, but automatically manages it for you. Most string implementations have a built-in array of 16 characters (so short strings don't fragment the heap) and use the heap for longer strings.
You can access a string's char array like this:
std::string myString = "Hello World";
const char *myStringChars = myString.c_str();
C++ strings can contain embedded \0 characters, know their length without counting, are faster than heap-allocated char arrays for short texts and protect you from buffer overruns. Plus they're more readable and easier to use.
However, C++ strings are not (very) suitable for usage across DLL boundaries, because this would require any user of such a DLL function to make sure he's using the exact same compiler and C++ runtime implementation, lest he risk his string class behaving differently.
Normally, a string class would also release its heap memory on the calling heap, so it will only be able to free memory again if you're using a shared (.dll or .so) version of the runtime.
In short: use C++ strings in all your internal functions and methods. If you ever write a .dll or .so, use C strings in your public (dll/so-exposed) functions.
Arkaitz is correct that string is a managed type. What this means for you is that you never have to worry about how long the string is, nor do you have to worry about freeing or reallocating the memory of the string.
On the other hand, the char[] notation in the case above has restricted the character buffer to exactly 256 characters. If you tried to write more than 256 characters into that buffer, at best you will overwrite other memory that your program "owns". At worst, you will try to overwrite memory that you do not own, and your OS will kill your program on the spot.
Bottom line? Strings are a lot more programmer friendly, char[]s are a lot more efficient for the computer.
Well, string type is a completely managed class for character strings, while char[] is still what it was in C, a byte array representing a character string for you.
In terms of API and standard library everything is implemented in terms of strings and not char[], but there are still lots of functions from the libc that receive char[] so you may need to use it for those, apart from that I would always use std::string.
In terms of efficiency of course a raw buffer of unmanaged memory will almost always be faster for lots of things, but take in account comparing strings for example, std::string has always the size to check it first, while with char[] you need to compare character by character.
I personally do not see any reason why one would like to use char* or char[] except for compatibility with old code. std::string's no slower than using a c-string, except that it will handle re-allocation for you. You can set it's size when you create it, and thus avoid re-allocation if you want. It's indexing operator ([]) provides constant time access (and is in every sense of the word the exact same thing as using a c-string indexer). Using the at method gives you bounds checked safety as well, something you don't get with c-strings, unless you write it. Your compiler will most often optimize out the indexer use in release mode. It is easy to mess around with c-strings; things such as delete vs delete[], exception safety, even how to reallocate a c-string.
And when you have to deal with advanced concepts like having COW strings, and non-COW for MT etc, you will need std::string.
If you are worried about copies, as long as you use references, and const references wherever you can, you will not have any overhead due to copies, and it's the same thing as you would be doing with the c-string.
One of the difference is Null termination (\0).
In C and C++, char* or char[] will take a pointer to a single char as a parameter and will track along the memory until a 0 memory value is reached (often called the null terminator).
C++ strings can contain embedded \0 characters, know their length without counting.
#include<stdio.h>
#include<string.h>
#include<iostream>
using namespace std;
void NullTerminatedString(string str){
int NUll_term = 3;
str[NUll_term] = '\0'; // specific character is kept as NULL in string
cout << str << endl <<endl <<endl;
}
void NullTerminatedChar(char *str){
int NUll_term = 3;
str[NUll_term] = 0; // from specific, all the character are removed
cout << str << endl;
}
int main(){
string str = "Feels Happy";
printf("string = %s\n", str.c_str());
printf("strlen = %d\n", strlen(str.c_str()));
printf("size = %d\n", str.size());
printf("sizeof = %d\n", sizeof(str)); // sizeof std::string class and compiler dependent
NullTerminatedString(str);
char str1[12] = "Feels Happy";
printf("char[] = %s\n", str1);
printf("strlen = %d\n", strlen(str1));
printf("sizeof = %d\n", sizeof(str1)); // sizeof char array
NullTerminatedChar(str1);
return 0;
}
Output:
strlen = 11
size = 11
sizeof = 32
Fee s Happy
strlen = 11
sizeof = 12
Fee
Think of (char *) as string.begin(). The essential difference is that (char *) is an iterator and std::string is a container. If you stick to basic strings a (char *) will give you what std::string::iterator does. You could use (char *) when you want the benefit of an iterator and also compatibility with C, but that's the exception and not the rule. As always, be careful of iterator invalidation. When people say (char *) isn't safe this is what they mean. It's as safe as any other C++ iterator.
Strings have helper functions and manage char arrays automatically. You can concatenate strings, for a char array you would need to copy it to a new array, strings can change their length at runtime. A char array is harder to manage than a string and certain functions may only accept a string as input, requiring you to convert the array to a string. It's better to use strings, they were made so that you don't have to use arrays. If arrays were objectively better we wouldn't have strings.