Why strtok_r() only accepts character array but not character pointer - c++

When I compile the code below, it compiles fine.
char str[] = "I am Lokesh Kumar. But I liked to be called The Loki";
char *token;
char *p = str;
while (token = strtok_r(p, " ", &p)) {
cout << token << endl;
}
But an error
"[Error] cannot convert 'char (*)[53]' to 'char**' for argument '3' to 'char* strtok_r(char*, const char*, char**)'"
popped up for the code below
char str[] = "I am Lokesh Kumar. But I liked to be called The Loki";
char *token;
char *p = str;
while (token = strtok_r(str, " ", &str)) {
cout << token << endl;
}
str and p both hold the address of the first character element, so why this error?

strtok_r takes the address of a char pointer as its third argument. It updates this pointer to record where the next search should start (how exactly it uses the pointer is an implementation detail). str is not a pointer; it is an array of char. You cannot pass its address to strtok_r because the address of an array (type char (*)[53]) is not the same thing as the address of a pointer (type char **).
The confusion comes from the automatic conversion of arrays to pointers to their first elements, which occurs when an array is used in most expression contexts, such as p = str.
Arrays and pointers are very different things, just like families and individual names. A family (array) is a collection of people (characters), a full name (pointer) points to an individual. A pointer to pointer to character is similar to a piece of paper on which you can write the name of an individual person (character in the sense of a person ;).
Note also these points:
p does not need to be initialized before its address is passed to strtok_r with a non-NULL first argument;
p should not be passed to strtok_r as the first argument in subsequent calls; pass NULL instead;
it is considered poor coding style to use an assignment expression as a test expression in a conditional statement; you should parenthesize the assignment and compare the value to NULL explicitly.
Here is a corrected version:
char str[] = "I am Lokesh Kumar. But I liked to be called The Loki";
char *token;
char *arg = str;
char *p;
while ((token = strtok_r(arg, " ", &p)) != NULL) {
printf("%s\n", token); // using printf since you tagged the question as C
arg = NULL;
}

str is an array, p is a pointer. It's not true that
str and p both hold the address of the first character element
What is true is that str is convertible to the address of its first character. But you wrote &str, so you get the address of the array, not the address of the first character.
Since strtok_r requires a pointer to a modifiable char pointer, there's no way to provide one other than to declare a pointer variable and pass its address. Passing nullptr as the third argument is not an option, because strtok_r stores its position through that pointer.

p has type char* so &p has type char**, as required.
str has type char[53], so &str has type char (*)[53].
The array decays to a pointer to the first element in many contexts, but it's not the same thing.
Anyway, did you know that the pointer that the 3rd parameter of strtok_r points to does not need to be initialized before the first call? It's output-only for that call.
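The type difference described above can be checked at compile time. This is a minimal sketch; the names text, cursor and same_address_value are made up for illustration and stand in for str and p from the question.

```cpp
#include <cassert>
#include <type_traits>

// Same string as in the question: 52 characters plus the terminating NUL.
char text[] = "I am Lokesh Kumar. But I liked to be called The Loki";
char *cursor = text;  // array-to-pointer conversion: cursor points at text[0]

// &cursor is what strtok_r wants for its third argument.
static_assert(std::is_same<decltype(&cursor), char **>::value,
              "&cursor is a pointer to a char pointer");
// &text is a pointer to the whole array, a different and incompatible type.
static_assert(std::is_same<decltype(&text), char (*)[53]>::value,
              "&text is a pointer to an array of 53 char");
static_assert(sizeof(text) == 53, "sizeof sees the array, not a pointer");

// The two addresses compare equal as raw addresses even though their
// types differ, which is probably where the confusion comes from.
bool same_address_value() {
    const void *whole_array = &text;
    const void *first_char = &text[0];
    assert(cursor == &text[0]);
    return whole_array == first_char;
}
```

Calling same_address_value() returns true: the array and its first element start at the same address, but only a real char * variable has an address of type char **.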

This is a generic example of strtok_r usage. You already got detailed information about the difference between char arrays and pointers from @john's answer.
For simplicity's sake, I am adding this example. I don't know why you want the pointer to point to str.
You can split the string into tokens in the simple way below.
There are two ways to solve the issue.
char str[] = "I am Lokesh Kumar. But I liked to be called The Loki";
char *p;
char *token = strtok_r(str, " ", &p);
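The fragment above stops after the first token. A possible complete loop is sketched below, assuming a POSIX system where strtok_r is declared in <string.h>; split_on_spaces is a hypothetical helper name, not something from the answers.

```cpp
#include <cassert>
#include <string.h>   // strtok_r (POSIX, not part of standard C++)
#include <string>
#include <vector>

// Splits a writable, NUL-terminated buffer on spaces.
// The buffer is modified in place: strtok_r overwrites delimiters with '\0'.
std::vector<std::string> split_on_spaces(char *buffer) {
    assert(buffer != nullptr);
    std::vector<std::string> tokens;
    char *save = nullptr;  // separate pointer variable; its address is a char **
    for (char *tok = strtok_r(buffer, " ", &save); tok != nullptr;
         tok = strtok_r(nullptr, " ", &save)) {  // NULL first argument continues the scan
        tokens.emplace_back(tok);
    }
    return tokens;
}
```

The first call receives the buffer, every later call receives a null pointer, and the same save variable is passed each time so the function can resume where it stopped.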

Related

If char *name = "Emma", why is &name not the same as &name[0]?

I've been following CS50 and I'm pretty sure I keep hearing that a pointer is just an address
And that the address of an array is just the address of the first item in that array
But when I run the following code:
// we can also just create strings we need by using char * like this
char *name = "Emma 123";
char *another_string = "4";
printf("This is the string we get from name: %s\n", name);
// Print the value of the pointer pointing to name
printf("This is the pointer we get from name: %p\n", name);
// Now print out the address of the first letter in Emmas name
printf("This is the pointer we get by resolving &name: %p\n", &name);
printf("This is the pointer we get by resolving &name[0]: %p\n", &name[0]);
printf("Hmmmm. There is something different between &name and &name[0]. Somehow, I thought they should be the same\n");
I get this as the output:
This is the string we get from name: Emma 123
This is the pointer we get from name: 0x4007ad
This is the pointer we get by resolving &name: 0x7ffd2d7d5468
This is the pointer we get by resolving &name[0]: 0x4007ad
I've been reading background articles but somehow, I can't seem to grasp why &name and &name[0] aren't the same
edit: I removed the c++ tag as I had incorrectly assumed these elements were common to c and c++
edit: I then added it back as it appears someone else did the same then decided it was a bad idea. Trying not to offend people
char *name = "Emma 123";
name stores the address of first element ('E').
&name is the address of name. It is a pointer to pointer.
&name[0] = &(*(name+0)) = (address of first element) = name
&name is the memory location of the pointer to your array.
&name[0] is the memory location of the first char in your array.
Note: as this is a multi-tagged C/C++ question: this answer applies to C++
From Member access operators:
Built-in subscript operator provides access to an object pointed-to by the pointer or array operand. [...]
The built-in subscript expression E1[E2] is exactly identical to the expression *(E1 + E2) [except evaluation order (since C++17)], that is, the pointer operand (which may be a result of array-to-pointer conversion, and which must point to an element of some array or one past the end) is adjusted to point to another element of the same array, following the rules of pointer arithmetics, and is then dereferenced.
meaning &name[0], which is &(name[0]) is equivalent to &(*name).
Also note that string literals should be constants (and your compiler should either reject your program, or emit a warning about non-conformant C++). Corrected example (C++, not C):
#include <iostream>
int main() {
const char * name = "Emma 123";
std::cout << name << "\n"; // Emma 123
std::cout << &name << "\n"; // 0x7ffd2d7d5468
std::cout << &(*name) << "\n"; // Emma 123
std::cout << &(name[0]) << "\n"; // Emma 123
}
As name is a pointer, &name, which takes the address of the pointer, is a pointer to a pointer.
#include <type_traits>
const char * name = "Emma 123";
static_assert(std::is_same_v<decltype(&name), const char **>);
static_assert(std::is_same_v<decltype(&(*name)), const char *>);
static_assert(std::is_same_v<decltype(&(name[0])), const char *>);
Now, whilst std::basic_ostream<CharT,Traits>::operator<< has a member overload taking a const void* parameter, there are also non-member overloads for const char*:
Character and character string arguments (e.g., of type char or const char*) are handled by the non-member overloads of operator<<.
Thus, for const char* arguments, the non-member overload will be chosen, which prints the characters of the character array whose first element the const char* points to.
For const char**, however, the member const void* will be chosen:
std::cout << &name << "\n"; // 0x7ffd2d7d5468
std::cout << static_cast<const void*>(&name) << "\n"; // 0x7ffd2d7d5468
For C++ (I don't know C, but I do know that this code is different whether it is C or C++):
String literals are constant. It should be
const char *name = "Emma 123";
That aside, name is already the pointer to the first element in the array. You get expected output with:
#include <iostream>
int main(){
const char* name = "Emma";
std::cout << static_cast<const void*>(name) <<"\n";
std::cout << static_cast<const void*>(&name[0]) << "\n";
}
Possible output:
0x402005
0x402005
The static cast is needed to bypass the ostream::operator<< for char* that prints the string.
&name is the address of where name is stored, not the address stored in name.
This answer applies to both C and C++, with the exception of the paragraph that says "in C++".
why is &name not the same as &name[0]?
Because the address where the pointer is stored is separate from the address where the pointer points to, which is where the array is.
And that the address of an array is just the address of the first item in that array
You seem to be conflating arrays and pointers. An array is not a pointer and pointer is not an array. The quoted sentence is about arrays and it doesn't apply to pointers.
char *name = "Emma 123";
This is ill-formed in C++ (since C++11). A string literal is not convertible to a pointer to non-const char. To fix this, use const char*.
printf("This is the pointer we get from name: %p\n", name);
printf("This is the pointer we get by resolving &name: %p\n", &name);
The behaviour of this program is undefined. You must pass arguments of correct type, and the correct type for %p is void* while you passed char* and char**. A corrected example:
const void* v_name = name;
void* v_name_a = &name;
printf("This is the pointer we get from name: %p\n", v_name);
printf("This is the pointer we get by resolving &name: %p\n", v_name_a);
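The distinction between "where the pointer lives" and "where it points" can be tested directly. The sketch below uses made-up function names and compares addresses through const void *; it also shows the array case, where the quoted CS50 statement does hold.

```cpp
#include <cassert>

// &ptr[0] is &(*(ptr + 0)), i.e. the address already stored in ptr.
bool element_address_equals_pointer_value(const char *ptr) {
    assert(ptr != nullptr);
    return static_cast<const void *>(&ptr[0]) == static_cast<const void *>(ptr);
}

// &ptr is the address of the pointer variable itself (here: the parameter),
// which is a different object from the characters it points to.
bool pointer_object_lives_elsewhere(const char *ptr) {
    const void *where_ptr_lives = &ptr;
    const void *where_ptr_points = ptr;
    return where_ptr_lives != where_ptr_points;
}

// For a real array, the array and its first element start at the same address.
bool array_address_equals_first_element() {
    char arr[] = "Emma 123";
    return static_cast<void *>(&arr) == static_cast<void *>(&arr[0]);
}
```

All three functions return true for a string such as "Emma 123": the pointer case and the array case behave differently because only the pointer is a separate object holding an address.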

char *newx=p+strlen(str1); how this will execute because strstr will return the pointer to the first match character of the another string

I cannot understand how this code works. As far as I know, strstr() returns a pointer to the first character of the matching substring. So how can we write char *newx=p+strlen(str1); when p is a pointer and strlen() returns an integer value?
p=strstr(str2,str1);
if(p){
char *newx=p+strlen(str1);
strcpy(t,newx);
}
This is simple pointer arithmetic.
Adding an integer to a pointer increments the pointer by the specified number of elements.
Say you have a pointer T *ptr. When you do something like this:
T *ptr2 = ptr + N;
The compiler actually does (an optimized) equivalent of this:
T *ptr2 = reinterpret_cast<T*>(reinterpret_cast<uintptr_t>(ptr) + (sizeof(T) * N));
So, the code in question is using strstr() to search the str2 string for a substring str1 and get a pointer to that substring. If the substring is found, that pointer is then incremented via strlen() to the address of the character immediately following the end of the substring, and then strcpy() is used to copy the remaining text from that address into the t string.
For example:
const char *str2 = "Hello StackOverflow";
const char *str1 = "Stack";
const char *p;
char t[10];
p = strstr(str2, str1); // p points to str2[6]...
if (p) { // substring found?
const char *newx = p + strlen(str1); // newx points to str2[11]...
strcpy(t, newx); // copies "Overflow"
}
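To see that the compiler scales the integer by the element size, one can measure the distance in bytes. This is a small sketch with hypothetical helper names (byte_distance, text_after); it assumes ptr + n stays inside the same array or one past its end.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Number of bytes between ptr and ptr + n.
template <typename T>
std::size_t byte_distance(const T *ptr, std::size_t n) {
    const char *from = reinterpret_cast<const char *>(ptr);
    const char *to = reinterpret_cast<const char *>(ptr + n);
    return static_cast<std::size_t>(to - from);
}

// Returns the text that follows the first occurrence of needle in haystack,
// or an empty string when needle does not occur.
std::string text_after(const char *haystack, const char *needle) {
    assert(haystack != nullptr && needle != nullptr);
    const char *p = std::strstr(haystack, needle);
    if (p == nullptr) {
        return std::string();
    }
    return std::string(p + std::strlen(needle));  // pointer + integer
}
```

For an int array, byte_distance(values, 3) equals 3 * sizeof(int); for char the factor is 1, which is why p + strlen(str1) lands exactly on the character after the match.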
This is a question about pointer arithmetic. I suggest you read https://www.tutorialspoint.com/cplusplus/cpp_pointer_arithmatic.htm for more information on pointer arithmetic.
To clarify this case: when you add strlen(str1) to p, the compiler knows that p is a char pointer, so the address is advanced by sizeof(char) * strlen(str1) bytes.
I hope this clarifies your issue, and I highly recommend reading up on pointer arithmetic.

Initialization of pointers in c++

I need to clarify my concepts regarding the basics of pointer initialization in C++. As per my understanding, a pointer must be assigned an address before putting some value using the pointer.
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This would probably show the correct output (10), but it may cause issues in larger programs, since p initially holds a garbage address that can be anything and may be used elsewhere in the program. So I believe this is incorrect; the correct way is:
int *p;
int x=10;
p=&x; //appropriate
cout << *p <<"\n";
My question is: if the above understanding is correct, does the same apply to char* as well?
const char *str="hello"; // inappropriate
cout << str << "\n";
//OR
const string str1= "hello";
const char str2[6] ="world";
const char *str=str1; //appropriate
const char *st=str2; //appropriate
cout << str << st << "\n";
Please advise.
Your understanding of strings is incorrect.
Let's take for example the very first line:
const char *str="hello";
This is actually correct. A string literal like "hello" is turned into a constant array by the compiler, and like all arrays it can decay to a pointer to its first element. So what you are doing is making str point to the first character of the array.
Then let's continue with
const string str1= "hello";
const char *str=str1;
This is actually wrong. A std::string object has no conversion operator to const char *. The compiler will give you an error for this. You need to use the c_str function to get a pointer to the contained string.
Lastly:
const char str2[6] ="world";
const char *st=str2; //appropriate
This is really no different than the first line when you declare and initialize str. This is, as you say, "appropriate".
About that first example with the "inappropriate" pointer:
int *p;
*p=10; //inappropriate
cout << *p <<"\n";
This is not only "inappropriate", this leads to undefined behavior and may actually crash your program. Also, the correct term is that the value of p is indeterminate.
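The valid variants from this answer can be sketched as follows. The function names are hypothetical, and the example assumes the std::string outlives every use of the pointer obtained from c_str().

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// A std::string does not convert to const char * implicitly; c_str() must be called.
std::size_t length_via_c_str(const std::string &s) {
    const char *view = s.c_str();  // valid while s is alive and unmodified
    return std::strlen(view);
}

// A char array converts to a pointer to its first element on its own.
char first_char_of_array() {
    const char str2[6] = "world";
    const char *st = str2;         // array-to-pointer conversion
    return *st;
}

// A pointer must hold the address of an existing object before it is dereferenced.
int write_through_pointer() {
    int x = 10;
    int *p = &x;                   // p now points at x
    assert(p != nullptr);
    *p += 1;                       // modifies x through p
    return x;
}
```

Here write_through_pointer() returns 11 because the write goes to x; with an uninitialized p the same assignment would be undefined behavior.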
When I declare a pointer
int *p;
I get an object p whose values are addresses. No ints are created anywhere. The thing you need to do is think of p as being an address rather than being an int.
At this point, this isn't particularly useful since you have no addresses you could assign to it other than nullptr. Well, technically that's not true: p itself has an address which you can get with &p and store it in an int**, or even do something horrible like p = reinterpret_cast<int*>(&p);, but let's ignore that.
To do something with ints, you need to create one. e.g. if you go on to declare
int x;
you now have an int object whose values are integers, and we could then assign its address to p with p = &x;, and then recover the object from p via *p.
Now, C style strings have weird semantics — the weirdest aspect being that C doesn't actually have strings at all: it's always working with arrays of char.
String literals, like "Hello!", are guaranteed to (act [1] like they) exist as an array of const char located at some address, and by C's odd conversion rules, this array automatically converts to a pointer to its first element. Thus,
const char *str = "hello";
stores the address of the h character in that character array. The declaration
const char str2[6] ="world";
works differently; this (acts [1] like it) creates a brand new array, and copies the contents of the string literal "world" into the new array.
As an aside, there is an obsolete feature here, kept for compatibility with legacy programs, that people still use in new programs, so you should be aware of it and know that it's 'wrong': older compilers let you break the type system and actually write
char *str = "hello";
This shouldn't work because "hello" is an array of const char, but the standard permitted this specific conversion as a deprecated feature until C++11 removed it; many compilers still accept it with a warning. Either way, you're not allowed to modify the contents of the array.
[1]: By the "as if" rule, the program only has to behave as if things happen as I describe, but if you peeked at the assembly code, the actual way things happen can be very different.
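The difference between pointing at a literal and copying it into an array can be illustrated with a short sketch. The function name is made up; the static_assert relies on a string literal being an lvalue of array type.

```cpp
#include <cassert>
#include <cstring>
#include <type_traits>

// "hello" is an array of 6 const char (5 letters plus the terminating NUL).
static_assert(std::is_same<decltype("hello"), const char (&)[6]>::value,
              "a string literal is an lvalue of type const char[6]");

// An array initialized from a literal is an independent, writable copy.
bool array_copy_is_independent() {
    const char *literal = "world";  // points at the literal's storage
    char copy[6] = "world";         // new array, contents copied in
    assert(sizeof(copy) == 6);
    copy[0] = 'W';                  // fine: modifies only the copy
    return std::strcmp(copy, "World") == 0 &&
           std::strcmp(literal, "world") == 0 &&
           static_cast<const void *>(copy) != static_cast<const void *>(literal);
}
```

Writing through copy is allowed because it is an ordinary array; writing through a pointer to the literal itself would not be.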

C++ char pointer

Why does the following happen?
char str[10]="Pointers";
char *ptr=str;
cout << str << "\n"; // Output : Pointers
int abc[2] = {0,1 };
int *ptr1 = abc;
cout <<ptr1 << "\n"; // But here the output is an address.
// Why are the two outputs different?
As others have said, the reason for the empty space is because you asked it to print out str[3], which contains a space character.
Your second question seems to be asking why there's a difference between printing a char* (it prints the string) and int* (it just prints the address). char* is treated as a special case, it's assumed to represent a C-style string; it prints all the characters starting at that address until a trailing null byte.
Other types of pointers might not be part of an array, and even if they were there's no way to know how long the array is, because there's no standard terminator. Since there's nothing better to do for them, printing them just prints the address value.
1) because str[3] is a space so char * ptr = str+3 points to a space character
2) The << operator is overloaded, the implementation is called depending on argument type:
a pointer to an int (int*) uses the default pointer implementation and outputs the formatted address
a pointer to a char (char*) is specialized: the output is formatted as a null-terminated string starting at the character it points to. If you want to output the address, you must cast it to void*
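The overload selection can be observed by writing into a string stream instead of cout. This is a sketch with hypothetical helper names; the textual form of an address is implementation-defined, so no particular output is assumed.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// const char * selects the overload that prints characters up to the NUL.
std::string as_text(const char *s) {
    assert(s != nullptr);
    std::ostringstream out;
    out << s;
    return out.str();
}

// const void * selects the overload that prints the address value.
// Any object pointer, including char * and int *, converts to const void *.
std::string as_address(const void *p) {
    std::ostringstream out;
    out << p;
    return out.str();
}
```

With char str[10] = "Pointers", as_text(str) yields the string, while as_address(str) yields whatever address notation the implementation uses, just as an int * would when printed directly.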
The empty space is actually Space character after "LAB". You print the space character between "LAB" and "No 5".
Your second question: You see address, because ptr1 is actually address (pointer):
int *ptr1;
If you want to see its first element (0), you should print *ptr1

using strings as a pointer to its first character

int main()
{
char name[]="avinash";
const char* nameano="a";
strtok(name,"n");
cout<<"the size of name is"<< sizeof(name);
cout<< name;
}
strtok takes in arguments (char*, const char*); name is an array, and hence a pointer to its first element. But if we make a declaration like
string name="avinash";
and pass name as first argument to strtok, then the program doesn't work, but it should, because name, a string, is a pointer to its first character.
Also, if we write
const string n = "n";
and pass it as second argument it doesn't work; this was my first problem.
Now also the sizeof(name) output is 8, but it should be 4, as avinash has been tokenized. Why does this happen?
You are confusing several things.
strtok takes in arguments (char*, const char*)....name is an array and hence a pointer to its first element...
name is an array, and it's not a pointer to its first element. An array decays to a pointer to its first element in several contexts, but in principle it's a completely different thing. You notice this e.g. when you apply the sizeof operator to a pointer and to an array: for an array you get the array size (i.e. the cumulative size of its elements), for a pointer you get the size of a pointer (which is fixed).
but if we made a declaration like string name="avinash" and passed name as argument
then the prog doesnt work but it should because name of string is a pointer to its first character...
If you make a declaration like
string name="avinash";
you are saying a completely different thing; string here is not a C-string (i.e. a char[]), but the C++ std::string type, which is a class that manages a dynamic string; those two things are completely different.
If you want to obtain a constant C-string (const char *) from a std::string you have to use its c_str() method. Still, you can't use the pointer obtained in this way with strtok, since c_str() returns a pointer to a const C-string, i.e. it cannot be modified. Notice that strtok is not intended to work with C++ strings, since it's part of the legacy C library.
also if we write const string n = "n"; and pass it as second argument it doesnt work...this was my first problem...
This doesn't work for the exact same reason, but in this case you can simply use the c_str() method, since the second argument of strtok is a const char *.
now also the sizeof(name) output is 8 but it should be 4 as avinash has been tokenised..
sizeof returns the "static" size of its operand (i.e. how much memory is allocated for it), it knows nothing about the content of name. To get the length of a C-string you have to use the strlen function; for C++ std::string just use its size() method.
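The sizeof versus strlen distinction can be made concrete with the string from the question. This is a sketch; the Sizes struct and the function name are invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

struct Sizes {
    std::size_t array_bytes;    // sizeof(name): fixed when the array is declared
    std::size_t length_before;  // strlen(name) before tokenizing
    std::size_t length_after;   // strlen(name) after tokenizing
};

Sizes sizes_around_strtok() {
    char name[] = "avinash";                // 7 characters + NUL = 8 bytes
    Sizes result;
    result.array_bytes = sizeof(name);
    result.length_before = std::strlen(name);
    char *token = std::strtok(name, "n");   // overwrites the first 'n' with '\0'
    assert(token == name);                  // the first token starts at name[0]
    result.length_after = std::strlen(name);
    return result;
}
```

The array stays 8 bytes long throughout; only the position of the first NUL changes, so strlen drops from 7 to 3 ("avi").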
I think you have to pass your name[] array to strtok like this:
strtok(&name[0],"n");
First of all, strtok is C and not C++ and is not made to work with C++ strings. Plus, it can't work if you don't use the return value of the function.
char name[]="avinash";
char * tok_name = strtok(name,"n");
std::cout<<"the length of the token is"<< strlen(tok_name);
std::cout<< tok_name;
You should consider using std::string::find, std::string::substr and other facilities from the standard library instead of C functions.
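One way the std::string::find and std::string::substr suggestion could look is sketched below. The split function is a hypothetical helper, and skipping empty pieces is a choice made here to mimic how strtok treats repeated delimiters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits text on a single delimiter character without modifying the input.
std::vector<std::string> split(const std::string &text, char delimiter) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    while (start <= text.size()) {
        std::size_t end = text.find(delimiter, start);
        if (end == std::string::npos) {
            end = text.size();               // last piece runs to the end
        }
        assert(end >= start);
        if (end > start) {                   // skip empty pieces
            parts.push_back(text.substr(start, end - start));
        }
        start = end + 1;
    }
    return parts;
}
```

Unlike strtok, this leaves the original std::string untouched and returns every token at once, so there is no hidden state between calls.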
As you said, strtok takes arguments of type char * and const char *. A string is not a char * so passing it as the first argument will not compile. For the second argument, you can convert a string to a const char * using the c_str() member function.
What problem are you trying to solve? If you're just trying to learn how strtok works, it would be much better to stick to raw character arrays. That function is inherited from C and thus wasn't designed with C++ strings in mind.
I don't believe the actual size of the name array is ever changed when you call strtok. The call remembers the last location it was called if you pass it a null pointer, and it continues to tokenize the string. The return value of the strtok call is the token that has been found in the string provided. Your code is calling sizeof(name) which is never actually adjusted by the strtok function.
int main()
{
char name[] = "avinash";
const char* nameano = "a";
char* token;
token = strtok(name, "n");
cout << "the length of the TOKEN is" << strlen(token) << endl;
cout << token << endl;
cout << "the length of the string is" << strlen(name) << endl;
cout << name << endl;
}
Try this maybe? I am not in front of a compiler, so it's likely I made a mistake, but it should get you on the right track to solve this problem.
This might also help:
http://msdn.microsoft.com/en-us/library/2c8d19sb(v=vs.71).aspx