Pointing to some characters in a string using a pointer - c++

#include<iostream>
using namespace std;
int main()
{
char s1[80]={"This is a developed country."};
char *s2[8];
s2[0]=&s1[10];
cout<<*s2; //Predicted OUTPUT: developed
// Actual OUTPUT: developed country.
return 0;
}
I want cout<<*s2; to print only the word "developed", so I declared s2 as *s2[8], intending a length of 8 characters. What can I do so that cout<<*s2 prints only up to that length? I'm using the dmc, lcc and OpenWatcom compilers. This is only a small part of a bigger program in which I'm using the string data type. Many thanks for answering my question :)

s2 is a length 8 array of pointers to char. You are making its first element point to s1 starting at position 10. That is all. You are not using the remaining elements of that array. Therefore the length of s2 is irrelevant.
You could have done this instead:
char* s2 = &s1[10];
If you want to create a string out of part of s1, you can use std::string:
std::string s3(s1+10, s1+19);
std::cout << s3 << std::endl;
Note that this allocates its own memory buffer and holds a copy of the original character sequence. If you only want a view of part of another string, you can easily implement a class holding a begin pointer and a one-past-the-end pointer into the original. Here's a rough sketch:
struct string_view
{
typedef const char* const_iterator;
template <typename Iter>
string_view(Iter begin, Iter end) : begin(begin), end(end) {}
const_iterator begin;
const_iterator end;
};
std::ostream& operator<<(std::ostream& o, const string_view& s)
{
for (string_view::const_iterator i = s.begin; i != s.end; ++i)
o << *i;
return o;
}
then
int main()
{
char s1[] = "This is a developed country.";
string_view s2(s1+10, s1+19);
cout << s2 << endl;
}

s2 is an array of pointers to char (that is, an array of char*). You are only ever using the zeroth element of this array.
&s1[10] points to the 11th character in the string s1. That address is assigned to the zeroth element of s2.
In the cout statement, *s2 is equivalent to s2[0]. So cout << *s2; outputs the zeroth element of s2, which points to the 11th character of s1. cout will trundle along the memory until the null terminator of your string is reached.

C strings must be null-terminated, i.e. end with '\0'. The start of s2 is fine, but cout will continue reading until that terminator at the end of s1. If you want to output only part of the string this way, you have to actually copy the data and terminate the copy, rather than simply setting the pointer.

Your mistake is thinking that
char *s2[8];
declares a pointer to an array of 8 characters (or, not equivalently, a pointer to a string-with-exactly-8-characters). It doesn't do either of those. Instead of declaring a pointer-to-an-array, it declares an array-of-pointers.
If you want s2 to be a pointer to an array-of-8-characters, you need:
char (*s2)[8];
But, that's still messed up. You ask:
What can I do so that the variable *s2 will store only up to its length?
Do you think its length is 8? Before trying to answer that, return to your definition of s1:
char s1[80]={"This is a developed country."};
Is the length 80, or 28? The answer is either, depending on how you define 'length' - the length of the array or the length up to the null terminator?
All of these misconceptions about size are unhelpful. As @n.m. has pointed out in a comment, the solution to all pointer problems in C++ is to stop using pointers. (Apologies if I've mis-paraphrased n.m.!)
#include<iostream>
#include<string>
using namespace std;
int main()
{
string s1="This is a developed country.";
string s2;
s2 = s1.substr(10, 9);
cout << s2;
return 0;
}

If you want to do it ghetto style and skip the std::string for some reason you can always use strncpy, memcpy or strstr etc.
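#include <cstring>
#include <iostream>
int main()
{
char s1[80]="This is a developed country.";
char s2[10]; // 9 characters plus the terminating '\0'
std::strncpy(s2,s1+10,9); // copies "developed" but does not append a '\0'
s2[9] = '\0';
std::cout << s2 << std::endl;
std::cin.get();
return 0;
}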

s2 is an array whose elements are of type char*, so it can't store the characters of a string itself. If you want to get "developed" out of the sentence, you can store the words separately and write code like this:
#include<iostream>
using namespace std;
int main()
{
const char *s1[]={"This", "is", "a", "developed", "country."};
const char *s2[8];
s2[0]= s1[3]; // s1[3] points to the separate literal "developed"
cout<<s2[0]; // OUTPUT: developed
return 0;
}

Related

I am facing an issue with string and null characters in C++. Null character is behaving differently

I am facing an issue with string and null characters in C++.
When I write '\0' in the middle of a string literal and print the string, I get only the part before the '\0'. On the other hand, when I read the string as input and then set one of its characters to '\0', it prints differently. Why is that, and why is sizeof(string) 32 in both cases?
Code is below for reference. Please help.
First code:
#include<iostream>
using namespace std;
int main(){
string s = "he\0llo";
cout<<s.length()<<"\n";
cout<<s<<endl;
cout<<sizeof(s)<<"\n";
}
output of First code:
2\n
he\n
32\n
Second code
#include<iostream>
using namespace std;
int main(){
string s;
cin>>s;
s[1] = '\0';
cout<<s<<"\n";
cout<<s.length()<<"\n";
cout<<sizeof(s)<<"\n";
return 0;
}
output of second code:
hllo\n
5\n
32\n
std::string's implicit const CharT* constructors and assignment operator don't know the exact length of the string argument. Instead, it only knows that this is a const char* and is forced to assume that the string will be null-terminated, and thus computes the length using std::char_traits::length(...) (effectively std::strlen).
As a result, constructing a std::string object with an expression like:
std::string s = "he\0llo";
will compute 2 as the length of the string, since it assumes the first \0 character is the null terminator for the string, whereas your second example of:
s[1] = '\0';
is simply adding a null character into an already constructed string -- which does not change the size of the string.
If you want to construct a string with a null character in the middle, you can't let it compute the length of the string for you. Instead, you will have to construct the std::string and give it the length in some other way. This could either be done with the string(const char*, size_t) constructor, or with an iterator pair if this is an array:
// Specify the length manually
std::string s{"he\0llo",6};
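// Using iterators to a different container (such as an array).
// Note: std::end(c_str) is past the array's implicit terminating '\0',
// so this string has length 7, not 6.
const char c_str[] = "he\0llo";
std::string s{std::begin(c_str), std::end(c_str)};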
Note: sizeof(s) is telling you the size of std::string class itself in bytes for your implementation of the standard library. This does not tell you the length of the contained string -- which can be determined from either s.length() or s.size().
As of C++14, you can write a string literal that is a std::string directly by using the s suffix from std::literals. This avoids the conversion from an array of chars to std::string, which stops at the first nul character.
#include<iostream>
#include <string>
using namespace std;
using namespace std::literals;
int main() {
string s = "he\0llo"s; // This initializes s to the full 6 char sequence.
cout << s.length() << "\n";
cout << s << endl;
cout << sizeof(s) << "\n"; // prints size of the s object, not the size of its contents
}
Results:
6
hello
28

Why pointer is returning String and not the first char since pointer is storing address of first char only?

I am new to C++. As per my knowledge in the above case names[0] should be the index of 'R'.
I guess I am missing out or lacking knowledge.
Please help me.
#include<iostream>
using namespace std;
int main() {
char *names[] = {
"Rohan",
"Sammy",
"Samuel",
"Henil"
};
// Expected R to be printed and not Rohan
cout << *names << endl;
return 0;
}
Output:
Rohan
You just need to put brackets *names[0]
[CODE]:
#include<iostream>
#include<string>
using namespace std;
int main(){
const char* names[] = {
"Rohan",
"Sammy",
"Samuel",
"Henil"
};
// names[0] is a pointer to 'R'; dereferencing it yields that single char
cout<<*names[0]<<endl;
return 0;
}
[RESULT]:
R
As per my knowledge in the above case names[0] should be the index of 'R'
names[0] is not an index. It is a pointer. It points to the char R within the string literal.
When a pointer is inserted into a character stream, the behaviour is generally not to stream the pointed-to object; for most pointer types the address is printed.
When you insert a pointer to char into a character stream, however, it is assumed to point to a null-terminated string, and the behaviour is to stream the entire string. You should adjust your expectations accordingly.
P.S. The program is ill-formed since C++11 because string literals are not implicitly convertible to char*.
Referring to an array by name will decay to a pointer to the 1st element of the array. So, names is the same as &names[0], and thus *names is the same as names[0], which is the 1st char* element in the array, not the 1st char of the 1st string in the array.

Why printing the array of strings does print first characters only?

Please explain the difference in the output of two programs.
cout << branch[i] in first program gives output as:
Architecture
Electrical
Computer
Civil
cout << *branch[i] in second program gives output as:
A
E
C
C
Why?
What is the logic behind *branch[i] giving only first character of each word as output and branch[i] giving full string as an output?
Program 1
#include <iostream>
using namespace std;
int main()
{
const char *branch[4] = { "Architecture", "Electrical", "Computer", "Civil" };
for (int i=0; i < 4; i++)
cout << branch[i] << endl;
system("pause");
return 0;
}
Program 2
#include <iostream>
using namespace std;
int main()
{
const char *branch[4] = { "Architecture", "Electrical", "Computer", "Civil" };
for (int i=0; i < 4; i++)
cout << *branch[i] << endl;
system("pause");
return 0;
}
When you declare a const char* and initialize it with a string literal, for example:
const char* some_string = "some text inside";
What actually happens is that the text is stored in special, read-only memory with a null terminating char ('\0') added after it. The same happens when declaring an array of const char*s. Every single const char* in your array points to the first character of its text in memory.
To understand what happens next, you need to understand how std::cout << works with const char*s. Since a const char* is a pointer, it can point to only one thing at a time: the beginning of your text. What std::cout << does with it is print every single character, starting with the one the pointer points to, until the null terminating character is encountered. Thus, if you declare:
const char* s = "text";
std::cout << s;
Your program will contain read-only memory holding the bytes "text\0", and s will point to the very first character ('t').
So far so good, but why does calling std::cout << *s output only a single character? That is because you dereference the pointer, getting what it points to - a single character.
I encourage you to read about pointer semantics and dereferencing a pointer. You'll then understand this very easily.
If, by any chance, you cannot connect what you have just read here to your example:
Declaring const char* branch[4]; you declare an array of const char*s. The expression branch[0] is equivalent to *(branch + 0), which dereferences a pointer into your array and yields a single const char*. Then *branch[0] is understood as *(*(branch + 0)), which dereferences that const char* and yields a single character.
branch[i] contains a char* pointer, which is pointing to the first char of a null-terminated string.
*branch[i] is using operator* to dereference that pointer to access that first char.
operator<< is overloaded to accept both char and char* inputs. In the first overload, it prints a single character. In the second overload, it outputs characters in consecutive memory until it reaches a null character.
This is because of operator precedence.
The subscript operator [] has higher precedence than the indirection operator *.
So branch[i] yields a const char * and *branch[i] yields a const char.
Streaming *branch[i] prints the single char located at the address pointed to by branch[i].
Streaming branch[i] prints the whole null-terminated string starting at the address pointed to by branch[i].

std::string stops at \0

I am having problems with std::string..
Problem is that '\0' is being recognized as end of the string as in C-like strings.
For example following code:
#include <iostream>
#include <string>
int main ()
{
std::string s ("String!\0 This is a string too!");
std::cout << s.length(); // same result as with s.size()
std::cout << std::endl << s;
return 0;
}
outputs this:
7
String!
What is the problem here? Shouldn't std::string treat '\0' just as any other character?
Think about it: if you are given a const char*, how will you determine which 0 is the true terminator and which is an embedded one?
You need to either explicitly pass the size of the string, or construct the string from two iterators (pointers?)
#include <string>
#include <iostream>
int main()
{
auto& str = "String!\0 This is a string too!";
std::string s(std::begin(str), std::end(str));
std::cout << s.size() << '\n' << s << '\n';
}
Example: http://coliru.stacked-crooked.com/a/d42211b7199d458d
Edit: @Rakete1111 reminded me about string literals:
using namespace std::literals::string_literals;
auto str = "String!\0 This is a string too!"s;
Your std::string really has only 7 characters and a terminating '\0', because that's how you construct it. Look at the list of std::basic_string constructors: There is no array version which would be able to remember the size of the string literal. The one at work here is this one:
basic_string( const CharT* s,
const Allocator& alloc = Allocator() );
The "String!\0 This is a string too!" char const[] array is converted to a pointer to the first char element. That pointer is passed to the constructor and is all information it has. In order to determine the size of the string, the constructor has to increment the pointer until it finds the first '\0'. And that happens to be one inside of the array.
If you happen to work with a lot of zero bytes in your strings, then chances are that std::vector<char> or even std::vector<unsigned char> would be a more natural solution to your problem.
You are constructing your std::string from a string literal. String literals are automatically terminated with a '\0'. A string literal "f\0o" is thus encoded as the following array of characters:
{'f', '\0', 'o', '\0'}
The string constructor taking a char const* will be called, and will be implemented something like this:
string(char const* s) {
auto e = s;
while (*e != '\0') ++e;
m_length = e - s;
m_data = new char[m_length + 1];
memcpy(m_data, s, m_length + 1);
}
Obviously this isn't a technically correct implementation, but you get the idea. The '\0' you manually inserted will be interpreted as the end of the string literal.
If you want the embedded '\0' to be kept as part of the string, you can use a std::string literal:
#include <iostream>
#include <string>
int main ()
{
using namespace std::string_literals;
std::string s("String!\0 This is a string too!"s);
std::cout << s.length(); // same result as with s.size()
std::cout << std::endl << s;
return 0;
}
Output:
30
String! This is a string too!
\0 is the terminating character, so if you want the text \0 to appear literally you'll need to escape it.
Take this as an example.
Whenever you want a backslash to be taken literally rather than as the start of an escape sequence, use two backslashes: "\\0"
In a string literal, \\0 produces two characters, a backslash followed by the digit 0, not a null character.
std::string test = "Test\\0 Test";
Results :
Test\0 Test
Most beginners also make this mistake when specifying e.g. file paths :
std::ifstream some_file("\new_dir\test.txt"); //Wrong: \n and \t are escape sequences
//You should be using it like this :
std::ifstream some_file("\\new_dir\\test.txt"); //Correct
In very few words, you're constructing your C++ string from a standard C string.
And standard C strings are zero-terminated. So, your C string parameter will be terminated at the first \0 character it can find. And that character is the one you explicitly provided in your string "String!\0 This is a string too!"
And not the 2nd one, which is implicitly and automatically provided by the compiler at the end of your standard C string.
Escape your \0
std::string s ("String!\\0 This is a string too!");
and you will get what you need:
31
String!\0 This is a string too!
That's not a problem, that's the intended behavior.
Maybe you could elaborate why you have a \0 in your string.
Using a std::vector would allow you to use \0 in your string.

Inconsistency between std::string and string literals

I have discovered a disturbing inconsistency between std::string and string literals in C++0x:
#include <iostream>
#include <string>
int main()
{
int i = 0;
for (auto e : "hello")
++i;
std::cout << "Number of elements: " << i << '\n';
i = 0;
for (auto e : std::string("hello"))
++i;
std::cout << "Number of elements: " << i << '\n';
return 0;
}
The output is:
Number of elements: 6
Number of elements: 5
I understand the mechanics of why this is happening: the string literal is really an array of characters that includes the null character, and when the range-based for loop calls std::end() on the character array, it gets a pointer past the end of the array; since the null character is part of the array, it thus gets a pointer past the null character.
However, I think this is very undesirable: surely std::string and string literals should behave the same when it comes to properties as basic as their length?
Is there a way to resolve this inconsistency? For example, can std::begin() and std::end() be overloaded for character arrays so that the range they delimit does not include the terminating null character? If so, why was this not done?
EDIT: To justify my indignation a bit more to those who have said that I'm just suffering the consequences of using C-style strings which are a "legacy feature", consider code like the following:
template <typename Range>
void f(Range&& r)
{
for (auto e : r)
{
...
}
}
Would you expect f("hello") and f(std::string("hello")) to do something different?
If we overloaded std::begin() and std::end() for const char arrays to return one less than the size of the array, then the following code would output 4 instead of the expected 5:
#include <iostream>
int main()
{
const char s[5] = {'h', 'e', 'l', 'l', 'o'};
int i = 0;
for (auto e : s)
++i;
std::cout << "Number of elements: " << i << '\n';
}
However, I think this is very undesirable: surely std::string and string literals should behave the same when it comes to properties as basic as their length?
String literals by definition have a (hidden) null character at the end of the string. Std::strings do not. Because std::strings have a length, that null character is a bit superfluous. The standard section on the string library explicitly allows non-null terminated strings.
Edit
I don't think I've ever given a more controversial answer in the sense of a huge amount of upvotes and a huge amount of downvotes.
The auto iterator when applied to a C-style array iterates over each element of the array. The determination of the range is made at compile-time, not run time. This is ill-formed, for instance:
char * str;
for (auto c : str) {
do_something_with (c);
}
Some people use arrays of type char to hold arbitrary data. Yes, it is an old-style C way of thinking, and perhaps they should have used a C++-style std::array, but the construct is quite valid and quite useful. Those people would be rather upset if their auto iterator over a char buffer[1024]; stopped at element 15 just because that element happens to have the same value as the null character. An auto iterator over a Type buffer[1024]; will run all the way to the end. What makes a char array so worthy of a completely different implementation?
Note that if you want the auto iterator over a character array to stop early there is an easy mechanism to do that: Add an if (c == '\0') break; statement to the body of your loop.
Bottom line: There is no inconsistency here. The auto iterator over a char[] array is consistent with how the auto iterator works over any other C-style array.
That you get 6 in the first case is an abstraction leak that couldn't be avoided in C. std::string "fixes" that. For compatibility, the behaviour of C-style string literals does not change in C++.
For example, can std::begin() and std::end() be overloaded for
character arrays so that the range they delimit does not include the
terminating null character? If so, why was this not done?
Assuming access through a pointer (as opposed to char[N]), only by embedding a variable inside the string containing the number of characters, so that seeking for NULL isn't required any more. Oops! That's std::string.
The way to "resolve the inconsistency" is not to use legacy features at all.
According to N3290 6.5.4, if the range is an array, boundary values are
initialized automatically without begin/end function dispatch.
So, how about preparing some wrapper like the following?
struct literal_t {
char const *b, *e;
literal_t( char const* b, char const* e ) : b( b ), e( e ) {}
char const* begin() const { return b; }
char const* end () const { return e; }
};
template< int N >
literal_t literal( char const (&a)[N] ) {
return literal_t( a, a + N - 1 );
}
Then the following code will be valid:
for (auto e : literal("hello")) ...
If your compiler provides user-defined literal, it might help to abbreviate:
literal_t operator"" _l( char const* p, std::size_t l ) {
return literal_t( p, p + l ); // l excludes '\0'
}
for (auto e : "hello"_l) ...
EDIT: The following will have smaller overhead
(user-defined literal won't be available though).
template< size_t N >
char const (&literal( char const (&x)[ N ] ))[ N - 1 ] {
return (char const(&)[ N - 1 ]) x;
}
for (auto e : literal("hello")) ...
If you wanted the length, you should use strlen() for the C string and .length() for the C++ string. You can't treat C strings and C++ strings identically--they have different behavior.
The inconsistency can be resolved using another tool in C++0x's toolbox: user-defined literals. Using an appropriately-defined user-defined literal:
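// Note: suffixes without a leading underscore are reserved for the standard
// library, which has provided exactly this operator""s since C++14
// (in namespace std::literals::string_literals).
std::string operator""s(const char* p, size_t n)
{
return std::string(p, n);
}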
We'll be able to write:
int i = 0;
for (auto e : "hello"s)
++i;
std::cout << "Number of elements: " << i << '\n';
Which now outputs the expected number:
Number of elements: 5
With these new std::string literals, there is arguably no more reason to use C-style string literals, ever.