Null character behavior in C++

#include <iostream>
#include <string>
using namespace std;
int main() {
string s = "hello";
cout << s[5] << endl;
return 0;
}
In the above code, if I print s[5], it correctly prints a NULL character. But if I change the code to this:
#include <iostream>
#include <string>
using namespace std;
int main() {
char s[] = {'a','b','c','d','e'};
cout << s[5] << endl;
return 0;
}
It doesn't print a NULL character but something random. If I store the string as a string or as a char*, then the behavior is in tune with what I expect.
But if I explicitly declare the character array, how does the compiler know when the array ends? Does the size of the array get stored at compile time?

String literals and std::strings store null terminated strings.
But an array of 5 char declared like:
char s[] = {'a','b','c','d','e'};
contains only 5 char, no null terminator.
But the compiler does know the size of s. It is part of the type of s. It has no convenient .size() function like std::string, std::vector or std::array does but you can get it by doing:
sizeof(s) / sizeof(s[0])
Or more safely in C++11:
std::extent<decltype(s)>::value
Or in C++17:
std::size(s)
Arrays have a habit of decaying to pointers though and then there is no way of getting the size, you have to keep track of it yourself. Which is why std::string, std::vector or std::array is preferred in C++.
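The three ways of getting the element count listed above can be compared side by side. This is only a minimal sketch: the function name all_sizes_are_five is made up for illustration, and std::size assumes a C++17 compiler.

```cpp
#include <cstddef>
#include <iterator>     // std::size (C++17)
#include <type_traits>  // std::extent

// Hypothetical helper: returns true when all three techniques report 5 elements.
inline bool all_sizes_are_five() {
    char s[] = {'a', 'b', 'c', 'd', 'e'};  // exactly 5 chars, no '\0'

    const std::size_t by_sizeof   = sizeof(s) / sizeof(s[0]);
    const std::size_t by_extent   = std::extent<decltype(s)>::value;
    const std::size_t by_std_size = std::size(s);

    return by_sizeof == 5 && by_extent == 5 && by_std_size == 5;
}
```

All three are computed from the type of s, so nothing is read from memory at run time to find the size.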

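std::string objects and string literals are null-terminated, and functions that take a const char* assume the same. A plain char array contains only the elements you give it. The compiler does know the array's size (it is part of the type), but it does not check indexes against it: reading s[5] in a five-element array is out of bounds and has undefined behavior, with no compile-time error and no exception at run time.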

The std::string class in C++ always keeps a null character after its contents, whether or not you supply one. A char array, by contrast, stores only what you put into it; if you want a null character, you have to add it explicitly in the initializer.

When you write char s[] = {'a','b','c','d','e'};, the array stores the characters listed and nothing else.
if I explicitly declare the character array, how does the compiler know when the array ends?
The size is determined by the number of initializers you provide.
Does the size of the array get stored at compile time?
The compiler knows the size as part of the array's type, but it is not stored separately in memory at run time, if that's what you meant.
And when you write string s = "hello";, the std::string is always null-terminated.

Your code is char s[] = {'a','b','c','d','e'};, so no \0 is placed at the end of your char array. Any of the three declarations below does put a \0 at the end:
1. char s[] = {'a','b','c','d','e', '\0'};
2. char s[] = "abcde";
3. string s = "abcde";
So if you use any of the three above, you will get a NULL character.
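To see the terminator in each of the three forms, a small check like the following may help. It is a sketch only, and the function name terminators_present is hypothetical.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: verifies that each form ends with '\0' at index 5.
inline bool terminators_present() {
    char a[] = {'a', 'b', 'c', 'd', 'e', '\0'};  // explicit terminator
    char b[] = "abcde";                           // the literal supplies '\0'
    std::string c = "abcde";                      // c[c.size()] is '\0'

    return sizeof(a) == 6 && sizeof(b) == 6       // 5 letters + terminator
        && a[5] == '\0' && b[5] == '\0' && c[5] == '\0'
        && std::strlen(a) == 5 && std::strlen(b) == 5 && c.size() == 5;
}
```

Note that both arrays occupy six bytes, while std::string reports a size of five because the terminator is not counted as part of the contents.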

"how does the compiler know when the array ends ?": the compiler knows how many elements the array has, from its declaration, and this information is available through the sizeof operator.
Anyway C-style arrays have virtually no size, as they are implicitly turned to pointers when passed as arguments, and their length is dropped (IMO a major flaw in the design of the C language). Overflow avoidance is your responsibility.
For this reason, you mustn't use a cout << statement if your string isn't null-terminated.
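The loss of length when an array is passed as an argument can be illustrated with two small functions. Both names (length_by_reference and size_after_decay) are invented for this sketch; passing the array by reference is one common way to keep the size, not the only one.

```cpp
#include <cstddef>

// Hypothetical helper: the array is passed by reference, so N is deduced
// from the argument's type and no decay to a pointer happens.
template <std::size_t N>
constexpr std::size_t length_by_reference(const char (&)[N]) {
    return N;
}

// Hypothetical helper: the parameter is a plain pointer, so sizeof gives
// the size of a pointer, not of the caller's array.
inline std::size_t size_after_decay(const char* p) {
    return sizeof(p);
}
```

Called with char s[] = {'a','b','c','d','e'};, the first function returns 5, while the second returns sizeof(const char*) regardless of how many elements s has.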

Related

std::strcpy and std::strcat with a std::string argument

This is from C++ Primer 5th edition. What do they mean by "the size of largeStr". largeStr is an instance of std::string so they have dynamic sizes?
Also I don't think the code compiles:
#include <string>
#include <cstring>
int main()
{
std::string s("test");
const char ca1[] = "apple";
std::strcpy(s, ca1);
}
Am I missing something?
strcpy and strcat only operate on C strings. The passage is confusing because it describes but does not explicitly show a manually-sized C string. In order for the second snippet to compile, largeStr must be a different variable from the one in the first snippet:
char largeStr[100];
// disastrous if we miscalculated the size of largeStr
strcpy(largeStr, ca1); // copies ca1 into largeStr
strcat(largeStr, " "); // adds a space at the end of largeStr
strcat(largeStr, ca2); // concatenates ca2 onto largeStr
As described in the second paragraph, largeStr here is an array. Arrays have fixed sizes decided at compile time, so we're forced to pick some arbitrary size like 100 that we expect to be large enough to hold the result. However, this approach is "fraught with potential for serious error" because strcpy and strcat don't enforce the size limit of 100.
Also I don't think the code compiles...
As above, change s to an array and it will compile.
#include <cstring>
int main()
{
char s[100] = "test";
const char ca1[] = "apple";
std::strcpy(s, ca1);
}
Notice that I didn't write char s[] = "test";. It's important to reserve extra space for "apple" since it's longer than "test".
You are missing the
char largeStr[100];
or similar the book doesn't mention.
What you should do is forget about strcpy and strcat and C-style strings real quick. Just remember how to make c++ std::string out of them and never look back.
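For comparison, here is roughly what the same concatenation looks like with std::string. This is a sketch under assumptions: the helper name join_with_space is made up, and ca1 and ca2 simply mirror the names used in the book excerpt.

```cpp
#include <string>

// Hypothetical helper: builds "ca1 ca2" without any manually sized buffer.
inline std::string join_with_space(const char* ca1, const char* ca2) {
    std::string largeStr = ca1;  // copies ca1; storage is sized automatically
    largeStr += " ";             // adds a space at the end
    largeStr += ca2;             // appends ca2, growing the string as needed
    return largeStr;
}
```

There is no size of 100 to guess here, so the overflow risk of strcpy and strcat does not arise.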

how character pointer could be used to point a string in c++?

First of all I am beginner in C++. I was trying to learn about type casting in C++ with strings and character pointer. Is it possible to point a string with a character pointer?
int main() {
string data="LetsTry";
cout<<(&data)<<"\n";
cout<<data<<"\n"<<"size "<<sizeof(data)<<"\n";
//char *ptr = static_cast<char*>(data);
//char *ptr=(char*)data;
char *ptr = reinterpret_cast<char*>(&data);
cout<<(ptr)<<"\n";
cout<<*ptr;
}
The above code yields outcome as below:
0x7ffea4a06150
LetsTry
size 32
`a���
`
I expected ptr to output the address 0x7ffea4a06150.
Historically, in the C language, strings were just memory areas filled with characters. Consequently, when a string was passed to a function, it was passed as a pointer to its very first character, of type char * for mutable strings, or char const * if the function had no intent to modify the string's contents. Such strings were delimited with a zero character ((char)0, a.k.a. '\0') at the end, so for a string of length 3 you had to allocate at least four bytes of memory (three characters of the string itself plus the zero terminator); and if you only had a pointer to a string's start, to know the size of the string you had to iterate over it to find how far away the zero character was (the standard function strlen did this). Some standard functions accepted an extra parameter for the string size if you knew it in advance (those starting with strn or, more primitive and efficient, those starting with mem); others did not. To concatenate two strings, you first had to allocate a buffer large enough to contain the result, and so on.
The standard functions that process char pointers can still be found in the C++ standard library, under the <cstring> header: https://en.cppreference.com/w/cpp/header/cstring, and std::string has the methods c_str() and data(), which return char pointers to its contents, should you need them.
When you write a program in C++, its main function has the header of int main(int argc, char *argv[]), where argv is the array of char pointers that contains any command-line arguments your program was run with.
Ineffective as it is, this scheme could still be regarded as an advantage over strings of limited capacity or plain fixed-size character arrays, for instance in mid-nineties, when Borland introduced the PChar type in Turbo Pascal and added a unit that exported Pascal implementations of functions from C's string.h.
std::string and const char* are different types. reinterpret_cast<char*>(&data) means: reinterpret the bytes located at &data (the std::string object itself, not its characters) as a char*, which is not what we want in this case.
so assuming we have type A and type B:
A a;
B b;
the following are conversions:
a = (A)b; // C style
// and
a = A(b);
// and
a = static_cast<A>(b); // C++ style
the following are bit reinterpretations:
a = *(A*)&b; // C style
// and
a = *reinterpret_cast<A*>(&b); // C++ style
Finally, this should work:
int main() {
string data = "LetsTry";
const char *ptr = data.c_str();
cout<< ptr << "\n";
}
Bit reinterpretation is sometimes used, for example when doing bit manipulation of a floating-point number, but there are rules to follow, such as the one discussed in "What is the strict aliasing rule?"
Also note that cout << ptr << "\n"; is a special case: feeding a pointer to std::cout usually outputs the address the pointer holds, but std::cout treats char* specially and outputs the contents of the char array instead.
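The special treatment of char pointers by output streams can be shown with a std::ostringstream, which uses the same operator<< overloads as std::cout. The two helper names below are hypothetical, and casting to const void* is one usual way (assumed here) to ask for the address instead of the text.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: streams the pointer through the const char* overload,
// which writes the characters up to the terminating '\0'.
inline std::string stream_as_text(const char* p) {
    std::ostringstream out;
    out << p;
    return out.str();
}

// Hypothetical helper: streams the pointer through the const void* overload,
// which writes the address in an implementation-defined format.
inline std::string stream_as_address(const char* p) {
    std::ostringstream out;
    out << static_cast<const void*>(p);
    return out.str();
}
```

With std::string data = "LetsTry";, stream_as_text(data.c_str()) yields the text, while stream_as_address(data.c_str()) yields something that looks like the 0x7ffe... value the question expected to see.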
In C++, string is a class, and what you are doing is creating a string object. So, to use it as a char *, you need to convert it using c_str().
You can refer below code:
std::string data = "LetsTry";
// declaring character array
char * cstr = new char [data.length()+1];
// copying the contents of the
// string to char array
std::strcpy (cstr, data.c_str());
Now you can use the char * to point to your data. Remember to delete[] cstr when you are done with it.
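If a writable copy is needed, one alternative to new char[] is to let a std::vector<char> own the buffer, so nothing has to be freed by hand. This is a sketch of that swapped-in technique, not the approach of the answer above; the function name to_mutable_buffer is hypothetical.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: copies the characters of data, plus the terminating
// '\0', into a vector that releases its memory automatically.
inline std::vector<char> to_mutable_buffer(const std::string& data) {
    const char* first = data.c_str();
    return std::vector<char>(first, first + data.size() + 1);
}
```

The result's data() member then gives a char * that can be passed to code expecting a modifiable, null-terminated C string.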

how to initialize static char array with NULL(or 0) in c++

I attempted to initialize char array with NULL like this syntax.
char str[5] = NULL;
But it returned error..
How can I initialize char array with NULL or 0 in C++?
Concretely, I want to print "Hello" in this example code.
#include <iostream>
int main()
{
char str[5] = NULL;
if(!str)
std::cout << "Hello" << std::endl;
return 0;
}
This code fails to compile because of the incorrect initialization. What initialization statement should I use instead?
An array can not be null. Null is state of a pointer, and an array is not a pointer.
In the expression !str, the array will decay to the address of the first element. The type of this decayed address is a pointer, but you cannot modify the decayed pointer. Since the array is never stored in the address 0 (except maybe in some special case on an embedded system where an array might actually be stored at that memory location), !str will never be true.
What you can do, is initialize all of the sub-objects of the array to zero. This can be achieved using the value-initialization syntax:
char str[5]{};
As I explained earlier, !str is still not meaningful, and is always false. One sensible thing that you might do is check whether the array contains an empty string. An empty string contains nothing except the terminator character (the array can have elements after the terminator, but those are not part of the string).
Simplest way to check whether a string is empty is to check whether the first character is the terminator (value-initialized character will be a terminator, so a value initialized character array does indeed contain the empty string):
if (!str[0]) // true if the string is empty
//...
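Putting the value-initialization and the empty-string check together might look like this. It is a minimal sketch; is_empty_c_string and zeroed_array_demo are names invented for the example.

```cpp
#include <cstring>

// Hypothetical helper: a C string is empty when its first char is the terminator.
inline bool is_empty_c_string(const char* s) {
    return s[0] == '\0';
}

// Hypothetical demo: the array starts out empty and stops being empty
// once some text is copied into it.
inline bool zeroed_array_demo() {
    char str[5]{};                         // all five chars are '\0'
    const bool empty_before = is_empty_c_string(str);

    std::strcpy(str, "Hi");                // 2 chars + terminator fit in 5
    const bool empty_after = is_empty_c_string(str);

    return empty_before && !empty_after;
}
```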
char str[5] = {'\0'};
if (str[0] != '\0')
//...
If you now put some characters into str and print it, the output stops at the first '\0'.
Every string literal ends with '\0'. You must make sure your array ends with '\0' too; if it does not, data will be read beyond your array (until a '\0' is encountered) and possibly beyond your application's memory space, in which case your app will crash.
However, if you need a string, you should use std::string or QString.
You can't initialise a char array with NULL, arrays can never be NULL. You seem to be mixing up pointers and arrays. A pointer could be initialised with NULL.
You can initialise the chars of your array with 0, that would be
char str[5] = {0};
but I don't think that's what you're asking.
#include <stdio.h>
int main() {
char str[5];
// set every element to the null character
for(int i=0;i<5;i++){
str[i]='\0';
}
printf("success");
return 0;
}
Hope this helps.

C++ copying char to a char array (Debug assertion failed) says string is not null terminated

Just trying to assign chars to the char array and it says the string is not null terminated?
I want to be able to change the teams around in the array like a scoreboard.
#include <string.h>
#include <iostream>
int main(int argc, char* argv[])
{
char Team1[7] = "Grubs";
char Team2[7] = "Giants";
char Team3[7] = "Bulls";
char Team4[7] = "Snakes";
char Team5[7] = "Echos";
char TeamList[5][7];
strcpy_s(TeamList[0], Team1);
strcat_s(TeamList[1], Team2);
strcat_s(TeamList[2], Team3);
strcat_s(TeamList[3], Team4);
strcat_s(TeamList[4], Team5);
TeamList[5][7]= '\0';
system("pause");
return 0;
}
strcat() (which is a "less-safe" version of strcat_s()) requires both strings to be null-terminated. That's because strcat() appends its second parameter (source) where first parameter (dest) ends. It replaces null-terminator of dest with first character of source, appends rest of source and then
a null-character is included at the end of the new string formed by
the concatenation of both
I would simply change
strcpy_s(TeamList[0], Team1);
strcat_s(TeamList[1], Team2);
strcat_s(TeamList[2], Team3);
strcat_s(TeamList[3], Team4);
strcat_s(TeamList[4], Team5);
to
strcpy_s(TeamList[0], Team1);
strcpy_s(TeamList[1], Team2);
strcpy_s(TeamList[2], Team3);
strcpy_s(TeamList[3], Team4);
strcpy_s(TeamList[4], Team5);
strcpy_s() does not have any requirements regarding contents of destination - only its capacity matters.
If you want to stick with strcat_s(), do this:
char TeamList[5][7];
memset(TeamList, 0, sizeof(char) * 5 * 7);
Then, this line:
TeamList[5][7]= '\0';
is not required. It is incorrect anyway, because the valid indexes of an N-element array are 0 through N-1.
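Why zeroing the destination makes concatenation legal can be seen in a small portable sketch. It uses std::strcat from <cstring> instead of the Microsoft-specific strcat_s, and = {} instead of memset, which has the same effect of filling the array with '\0'. The function name concat_into_zeroed_rows is hypothetical.

```cpp
#include <cstring>

// Hypothetical demo: each zeroed row already holds an empty, null-terminated
// string, so appending to it is well defined.
inline bool concat_into_zeroed_rows() {
    char TeamList[5][7] = {};             // every byte is '\0'
    std::strcat(TeamList[1], "Giants");   // 6 chars + '\0' fit in 7

    return std::strcmp(TeamList[1], "Giants") == 0
        && TeamList[0][0] == '\0';        // untouched rows stay empty
}
```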
EDIT
Since in your case swapping comes into play, I would suggest you totally different approach.
First of all:
#include <string>
Then, initialize teams this way:
std::string TeamList[] =
{
"Grubs",
"Giants",
"Bulls",
"Snakes",
"Echos"
};
Now, TeamList is an array containing 5 elements and each of these elements is an object of type std::string, containing name of a particular team.
Now, if you want to swap, let's say, teams 1 and 3:
std::swap(TeamList[1], TeamList[3]);
std::swap() is a standard C++ function extensively used in standard library implementation. It is overloaded for many standard types, including std::string. This solution has one, critical benefit: if string's content is held on the heap, swapping two strings is as simple as swapping pointers (and some length/capacity variables).
Oh, and one more thing: if you are not familiar with std::string and you would need to get pointer to a buffer containing string's data, you can do it this way:
const char* team_1_raw_name = TeamList[0].c_str();
strcat requires that there already be a null-terminated string in the destination to concatenate the source string onto; you're calling it with uninitialised values in the destination.
It looks like you want strcpy in every case, not just the first.
Also, remove the bogus TeamList[5][7]= '\0';. Even if you fix it to write inside the array bounds, each string has already been terminated by strcpy so there's no need to try to do that yourself.
Then stop messing around with low-level arrays and pointers. std::vector<std::string> would be much friendlier.
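A sketch of the std::vector<std::string> version of the scoreboard follows. The helper names make_teams and swap_teams are invented for illustration; at() is used so that a bad index throws instead of silently corrupting memory.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: builds the team list from the question.
inline std::vector<std::string> make_teams() {
    return {"Grubs", "Giants", "Bulls", "Snakes", "Echos"};
}

// Hypothetical helper: exchanges two positions on the scoreboard.
inline void swap_teams(std::vector<std::string>& teams,
                       std::size_t i, std::size_t j) {
    std::swap(teams.at(i), teams.at(j));  // at() throws on an invalid index
}
```

Swapping positions 1 and 3 of make_teams() leaves "Snakes" at index 1 and "Giants" at index 3, with no buffer sizes or terminators to manage.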

Difference between string and char[] types in C++

For C, we use char[] to represent strings.
For C++, I see examples using both std::string and char arrays.
#include <iostream>
#include <string>
using namespace std;
int main () {
string name;
cout << "What's your name? ";
getline(cin, name);
cout << "Hello " << name << ".\n";
return 0;
}
#include <iostream>
using namespace std;
int main () {
char name[256];
cout << "What's your name? ";
cin.getline(name, 256);
cout << "Hello " << name << ".\n";
return 0;
}
(Both examples adapted from http://www.cplusplus.com.)
What is the difference between these two types in C++? (In terms of performance, API integration, pros/cons, ...)
A char array is just that - an array of characters:
If allocated on the stack (like in your example), it will always occupy eg. 256 bytes no matter how long the text it contains is
If allocated on the heap (using malloc() or new char[]) you're responsible for releasing the memory afterwards and you will always have the overhead of a heap allocation.
If you copy a text of more than 256 chars into the array, it might crash, produce ugly assertion messages or cause unexplainable (mis-)behavior somewhere else in your program.
To determine the text's length, the array has to be scanned, character by character, for a \0 character.
A string is a class that contains a char array, but automatically manages it for you. Many string implementations have a small built-in buffer (around 16 characters is common) so that short strings don't fragment the heap, and use the heap for longer strings.
You can access a string's char array like this:
std::string myString = "Hello World";
const char *myStringChars = myString.c_str();
C++ strings can contain embedded \0 characters, know their length without counting, are faster than heap-allocated char arrays for short texts and protect you from buffer overruns. Plus they're more readable and easier to use.
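The point about embedded \0 characters can be checked with a short sketch. The function name make_with_embedded_null is hypothetical; the length is passed explicitly because constructing from a bare literal would stop at the first '\0'.

```cpp
#include <cstring>
#include <string>

// Hypothetical helper: builds a 5-character string whose third character is '\0'.
inline std::string make_with_embedded_null() {
    return std::string("ab\0cd", 5);  // explicit length keeps all 5 chars
}
```

For the returned string, size() reports 5 without scanning, whereas std::strlen on its c_str() stops at the embedded terminator and reports 2.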
However, C++ strings are not (very) suitable for usage across DLL boundaries, because this would require any user of such a DLL function to make sure he's using the exact same compiler and C++ runtime implementation, lest he risk his string class behaving differently.
Normally, a string class would also release its heap memory on the calling heap, so it will only be able to free memory again if you're using a shared (.dll or .so) version of the runtime.
In short: use C++ strings in all your internal functions and methods. If you ever write a .dll or .so, use C strings in your public (dll/so-exposed) functions.
Arkaitz is correct that string is a managed type. What this means for you is that you never have to worry about how long the string is, nor do you have to worry about freeing or reallocating the memory of the string.
On the other hand, the char[] notation in the case above has restricted the character buffer to exactly 256 characters. If you tried to write more than 256 characters into that buffer, at best you will overwrite other memory that your program "owns". At worst, you will try to overwrite memory that you do not own, and your OS will kill your program on the spot.
Bottom line? Strings are a lot more programmer friendly, char[]s are a lot more efficient for the computer.
Well, string type is a completely managed class for character strings, while char[] is still what it was in C, a byte array representing a character string for you.
In terms of API and standard library everything is implemented in terms of strings and not char[], but there are still lots of functions from the libc that receive char[] so you may need to use it for those, apart from that I would always use std::string.
In terms of efficiency of course a raw buffer of unmanaged memory will almost always be faster for lots of things, but take in account comparing strings for example, std::string has always the size to check it first, while with char[] you need to compare character by character.
I personally do not see any reason why one would want to use char* or char[] except for compatibility with old code. std::string is no slower than a C string, and it handles reallocation for you. You can set its size when you create it, and thus avoid reallocation if you want. Its indexing operator ([]) provides constant-time access (and is in every sense of the word the same thing as indexing a C string). Using the at method gives you bounds-checked safety as well, something you don't get with C strings unless you write it yourself. Your compiler will most often optimize out the indexer call in release mode. It is easy to make mistakes with C strings: delete vs delete[], exception safety, even how to reallocate a C string.
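Two of the points above, bounds checking with at and avoiding reallocation, can be sketched as follows. Both function names are hypothetical, and reserve() is used here as one way (an assumption on my part) to allocate capacity up front.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: returns true when at() refuses index i.
inline bool at_rejects_index(const std::string& s, std::size_t i) {
    try {
        (void)s.at(i);                 // throws std::out_of_range if i >= size()
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}

// Hypothetical helper: reserves room for n chars before anything is appended.
inline bool reserve_preallocates(std::size_t n) {
    std::string s;
    s.reserve(n);                      // capacity grows, contents stay empty
    return s.capacity() >= n && s.empty();
}
```

With a five-character string, index 4 is accepted and index 10 is rejected with an exception, whereas indexing a char array at 10 would simply be undefined behavior.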
And when you have to deal with advanced concepts like having COW strings, and non-COW for MT etc, you will need std::string.
If you are worried about copies, as long as you use references, and const references wherever you can, you will not have any overhead due to copies, and it's the same thing as you would be doing with the c-string.
One of the differences is null termination (\0).
In C and C++, a function that takes a char* or char[] receives a pointer to a single char and walks along memory until it reaches a 0 byte (often called the null terminator).
C++ strings can contain embedded \0 characters, know their length without counting.
#include<stdio.h>
#include<string.h>
#include<iostream>
using namespace std;
void NullTerminatedString(string str){
int NUll_term = 3;
str[NUll_term] = '\0'; // specific character is kept as NULL in string
cout << str << endl <<endl <<endl;
}
void NullTerminatedChar(char *str){
int NUll_term = 3;
str[NUll_term] = 0; // from specific, all the character are removed
cout << str << endl;
}
int main(){
string str = "Feels Happy";
printf("string = %s\n", str.c_str());
printf("strlen = %zu\n", strlen(str.c_str()));
printf("size = %zu\n", str.size());
printf("sizeof = %zu\n", sizeof(str)); // sizeof std::string class and compiler dependent
NullTerminatedString(str);
char str1[12] = "Feels Happy";
printf("char[] = %s\n", str1);
printf("strlen = %zu\n", strlen(str1));
printf("sizeof = %zu\n", sizeof(str1)); // sizeof char array
NullTerminatedChar(str1);
return 0;
}
Output:
strlen = 11
size = 11
sizeof = 32
Fee s Happy
strlen = 11
sizeof = 12
Fee
Think of (char *) as string.begin(). The essential difference is that (char *) is an iterator and std::string is a container. If you stick to basic strings a (char *) will give you what std::string::iterator does. You could use (char *) when you want the benefit of an iterator and also compatibility with C, but that's the exception and not the rule. As always, be careful of iterator invalidation. When people say (char *) isn't safe this is what they mean. It's as safe as any other C++ iterator.
Strings have helper functions and manage char arrays automatically. You can concatenate strings directly, whereas for a char array you would need to copy it to a new, larger array; strings can also change their length at runtime. A char array is harder to manage than a string, and certain functions may only accept a string as input, requiring you to convert the array to a string. It's better to use strings; they were made so that you don't have to use arrays. If arrays were objectively better we wouldn't have strings.