Why is char* always passed to a function along with the length of the string? - c++

I have recently started learning C/C++, but I don't understand the difference between
int a(char* str, int len)
{
    cout << str << len;
    return 0;
}
and
int a(char* str)
{
    cout << str << strlen(str);
    return 0;
}

When you pass char* without a length, how would you know how many elements to process? char* means a pointer to a character. When you pass a pointer, you have no idea (and cannot find out) how much memory (if any) was allocated for the pointer.
That's why C-strings are null-terminated (they end with a '\0' character), so you can detect their length by iterating over the pointer. Hence, if you want to use a pointer without giving the length of its allocated memory, you need to obey some conventions. But in general, e.g. when passing a buffer, you shouldn't expect any end-signalling character, so in this case you need to pass the length; otherwise you may end up reading or writing out of bounds.
For your particular example, you're fine with passing only a pointer provided you use your function only on C-strings, since strlen(str) uses this convention of counting until encountering a '\0'.
Buffer overflows are among the messiest and most nightmarish programming errors, and they can result in serious security issues. That's why you should try (whenever possible) to use std::string from the C++ standard library instead of C-style char* strings.
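To make the difference concrete, here is a minimal sketch (the function names are made up for illustration). The first function relies on the '\0' convention, the second on an explicit length, so only the second can handle a buffer that contains '\0' as ordinary data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Relies on the C-string convention: counts bytes up to the first '\0'.
// Only valid if str really is null-terminated.
std::size_t length_by_terminator(const char* str) {
    assert(str != nullptr);
    return std::strlen(str);
}

// Relies on an explicit length: inspects exactly len bytes and counts how
// many of them are zero. Works for any buffer, terminated or not.
std::size_t count_zero_bytes(const char* buf, std::size_t len) {
    assert(buf != nullptr || len == 0);
    std::size_t zeros = 0;
    for (std::size_t i = 0; i < len; ++i) {
        if (buf[i] == '\0') ++zeros;
    }
    return zeros;
}
```

For a five-byte buffer holding 'a', 'b', '\0', 'c', 'd', the first function stops after two characters, while the second one sees all five bytes because the caller told it how many there are.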

A C-string should always end with a termination character, called the null character. Its value is 0 (the character '\0', not the digit '0').
When we initialize a char array with a string literal, the compiler automatically adds the '\0' at the end.
char c[] = "Hello";
This creates an array of char with six elements. Yes, six elements.
c = {'H', 'e', 'l', 'l', 'o', '\0'}
When you print c, the output routine reads until it finds that '\0'. What if someone replaces it?
c[5] = '!';
Then nothing marks the end of the text, and reading continues into memory that does not belong to that variable (or maybe even to the program) until it happens to hit a null char. This is undefined behavior.
That is the main reason to pass the size (or the number of chars to read) to a function.
On the other hand, if you need to read some data from a stream, you can use a buffer. In that case, you should specify how many bytes to read, in that way you will not cause buffer overflows.
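The "six elements" claim can be checked directly. The sketch below uses a hypothetical array named greeting and compares the array size with what strlen reports.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// A writable array initialised from a literal: 5 visible characters + '\0'.
char greeting[] = "Hello";

// Number of array elements, including the terminator.
constexpr std::size_t greeting_elements = sizeof greeting;
static_assert(greeting_elements == 6, "literal \"Hello\" occupies six chars");

// Number of characters before the terminator.
std::size_t greeting_length() {
    // Sanity check: the last element is still the terminator.
    assert(greeting[greeting_elements - 1] == '\0');
    return std::strlen(greeting);
}
```

sizeof counts the terminator and strlen does not, which is why a buffer for n visible characters needs n + 1 elements.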

The answers above are to the point, so I'm going to discuss another perspective on the practice of passing a length along with char *.
As others said, the string pointed to by a char * does not always end with \0, and strlen() only works when it does. There are use cases, for example binary encodings, where data is represented as a string; in such cases the char * data may not end with \0. Besides, there are use cases that read or write only up to a certain length. In that case it is always necessary to check that the requested length is within the total length of the string. So, as a general convention, the length is passed explicitly and can then be used as needed.

Related

sprintf() is adding an extra variable

Why is this happening?:
char buf[256];
char date[8];
sprintf(date, "%d%02d%02d", Time.year(), Time.month(), Time.day());
snprintf(buf, sizeof(buf), "{\"team\":\"%s\"}", team.c_str());
Serial.println(date);
output:
20180202{"team":"IND"}
it should only be: 20180202
I don't know why {"team":"IND"} is getting added to the end of it.
Very likely the two arrays are laid out in memory so that buf directly follows date. "20180202" needs 9 bytes including the null terminator, so sprintf() writes the terminator past the end of date[8], and the later write to buf overwrites it. Printing date then runs on into buf, which appears to "concatenate" the two.
I can't write code to reproduce this because it's undefined behavior and thus not reliable. But I can tell you how you can avoid it:
snprintf(date, sizeof(date), "%d%02d%02d", Time.year(), Time.month(), Time.day());
snprintf(buf, sizeof(buf), "{\"team\":\"%s\"}", team.c_str());
This would print an incorrect (truncated) value, but would not cause any undefined behavior.
Having said that, why are you using snprintf() when this appears to be C++? There are more suitable solutions for this kind of problem.
Strings in C are simply arrays with a special arrangement. If the string has n printable characters, it should be stored in an array of size n + 1, so that you can add what is called a null terminator. It's a special value that indicates the end of the string.
Your second snprintf() is overwriting the null terminator that belongs to date, so the two strings appear to be concatenated.
You have reserved space to store exactly 8 chars:
char date[8];
To store the date properly 20180202 you need
char date[9];
because sprintf() puts the extra '\0' character to the buffer you pass for proper c-style string termination.
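A minimal sketch of the fix, assuming plain int values in place of the Time.year(), Time.month() and Time.day() calls from the question (format_date is a made-up helper). It uses snprintf with the real array size and reports whether the whole date fit.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Formats the date as YYYYMMDD into out. Returns true only if all characters
// plus the terminating '\0' fit; otherwise out holds a truncated (but still
// terminated) string.
template <std::size_t N>
bool format_date(char (&out)[N], int year, int month, int day) {
    assert(month >= 1 && month <= 12 && day >= 1 && day <= 31);
    // snprintf never writes more than N bytes and returns the length the
    // full result would have had.
    int written = std::snprintf(out, N, "%d%02d%02d", year, month, day);
    return written >= 0 && static_cast<std::size_t>(written) < N;
}
```

With char date[9], format_date(date, 2018, 2, 2) succeeds and date holds 20180202. With char date[8] it returns false and date holds the truncated 2018020, but nothing is written outside the array.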
I'd suspect you declared your buffers like
char buffer[???];
char date[8];
Since these are most likely stored on your processor's stack, which typically grows downward, buffer ends up directly after date in memory. The output written to buffer therefore overwrites the terminating '\0' and appears immediately after date.

Strncpy should only be used with fixed length arrays

According to this StackOverflow comment strncpy should never be used with a non-fixed length array.
strncpy should never be used unless you're working with fixed-width, not-necessarily-terminated string fields in structures/binary files. – R.. Jan 11 '12 at 16:22
I understand that it is redundant if you are dynamically allocating memory for the string, but is there a reason why it would be bad to use strncpy over strcpy?
strncpy will copy data up to the limit you specify--but if it reaches that limit before the end of the string, it'll leave the destination unterminated.
In other words, there are two possibilities with strncpy. One is that you get behavior precisely like strcpy would have produced anyway (except slower, since it fills the remainder of the destination buffer with NULs, which you virtually never actually want or care about). The other is that it produces a result you generally can't put to any real use.
If you want to copy a string up to a maximum length into a fixed-length buffer, you can (for example) use sprintf to do the job:
char buffer[256];
sprintf(buffer, "%.255s", source); // the precision (.255) caps the copied length; a bare width would only pad
Unlike strncpy, this always zero-terminates the result, so the result is always usable as a string.
If you don't want to use sprintf (or similar), I'd advise just writing a function that actually does what you want, something on this general order:
void copy_string(char *dest, char const *source, size_t max_len) {
    if (max_len == 0) return; // no room even for the terminator
    size_t i;
    for (i = 0; i < max_len - 1 && source[i]; i++)
        dest[i] = source[i];
    dest[i] = '\0';
}
Since you've tagged this as C++ (in addition to C): my advice would be to generally avoid this whole mess in C++ by just using std::string.
If you really have to work with NUL-terminated sequences in C++, you might consider another possibility:
template <size_t N>
void copy_string(char (&dest)[N], char const *source) {
    size_t i;
    for (i = 0; i < N - 1 && source[i]; i++)
        dest[i] = source[i];
    dest[i] = '\0';
}
This only works when the destination is an actual array (not a pointer), but for that case, it gets the compiler to deduce the size of the array, instead of requiring the user to pass it explicitly. This will generally make the code a tiny bit faster (less overhead in the function call) and much harder to screw up and pass the wrong size.
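As an illustration of the array-reference idea, here is a self-contained sketch under a different, made-up name (copy_truncated), which also returns how many characters were copied so the caller can detect truncation.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: copies source into dest, truncating if necessary, and
// always terminates the result. N is deduced from the destination array.
// Returns the number of characters copied (excluding the terminator).
template <std::size_t N>
std::size_t copy_truncated(char (&dest)[N], const char* source) {
    static_assert(N > 0, "destination needs room for the terminator");
    assert(source != nullptr);
    std::size_t i = 0;
    for (; i < N - 1 && source[i] != '\0'; ++i) {
        dest[i] = source[i];
    }
    dest[i] = '\0';
    return i;
}
```

Called with a char[4] destination and the source "abcdef", it copies "abc" and returns 3. Passing a plain char* as the destination does not compile, which is the point of the design.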
The argument against using strncpy is that it does not guarantee that your string will be null-terminated.
A less error-prone way to copy a string in C when using non-fixed-length arrays is to use snprintf, which does guarantee null termination of your string.
A good blog post commenting on the *n* functions:
These functions let you specify the size of the buffer but – and this is really important – they do not guarantee null-termination. If you ask these functions to write more characters than will fill the buffer then they will stop – thus avoiding the buffer overrun – but they will not null-terminate the buffer.
This means that using strncpy and other such functions when not dealing with fixed arrays introduces an unnecessary risk of non-null-terminated strings, which can be time bombs in your code.
char * strncpy ( char * destination, const char * source, size_t num );
Limitations of strncpy():
It doesn't put a null-terminator on the destination string if it is completely filled. And, no null-character is implicitly appended at the end of destination if source is longer than num.
If num is greater than the length of source string, the destination string is padded with null characters up to num length.
Like strcpy, it is not a memory-safe operation. Because it does not check for sufficient space in destination before it copies source, it is a potential cause of buffer overruns.
Refer: Why should you use strncpy instead of strcpy?
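The first two limitations can be observed with a small sketch. Both helper names are made up: is_terminated checks whether a buffer holds a usable C-string, and strncpy_terminated shows the common workaround of forcing a terminator after the copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if buf[0..size) contains a '\0', i.e. holds a usable C-string.
bool is_terminated(const char* buf, std::size_t size) {
    assert(buf != nullptr);
    return std::memchr(buf, '\0', size) != nullptr;
}

// Copies with strncpy, then forces termination in the last element.
template <std::size_t N>
void strncpy_terminated(char (&dest)[N], const char* source) {
    static_assert(N > 0, "destination needs room for the terminator");
    std::strncpy(dest, source, N - 1);
    dest[N - 1] = '\0';
}
```

After strncpy(a, "abcdef", 4) on a char a[4], is_terminated(a, 4) is false. After strncpy(b, "x", 4), the three remaining elements of b are all '\0' because of the padding. strncpy_terminated always leaves a valid, possibly truncated, C-string.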
There are two functions for copying one string to another:
1> strcpy
2> strncpy
Neither checks how large the destination actually is. strcpy copies until it reaches the terminating '\0' of the source, so if the destination is too small it writes past its end; that is undefined behavior, which may corrupt the memory of the current process or crash it. strncpy writes at most num characters, so it stays within bounds as long as num does not exceed the destination size. It does not report truncation, though (it simply returns the destination pointer), and it leaves the destination unterminated when the source has num or more characters. So strncpy protects against overruns better than strcpy, but only if you terminate the result yourself.

Not sure why I am getting different lengths when using a string or a char

When I call gethostname with a char array my length is 25, but when I use a string my length is 64. I'm not really sure why. In both cases I declare the same size, HOST_NAME_MAX.
char hostname[HOST_NAME_MAX];
int host = gethostname(hostname, sizeof hostname); // gethostname returns an int status (0 on success)
expectedComputerName = hostname;
int size2 = expectedComputerName.length();
std::string test(HOST_NAME_MAX, 0);
host = gethostname(&test[0], test.length());
int testSize = test.length();
An std::string object can contain NULs (i.e. '\0' characters). You are storing the name in the first bytes of a string object that was created with a size of HOST_NAME_MAX length.
Storing something at the beginning of the string data won't change the length of the string, which therefore remains HOST_NAME_MAX.
When creating a string from a char pointer instead the std::string object created will contain up to, but excluding, the first NUL character (0x00). The reason is that a C string cannot contain NULs because the first NUL is used to mark the end of the string.
Consider what you're doing in each case. In the former code snippet, you're declaring a character array capable of holding HOST_NAME_MAX-1 characters (1 for the null terminator). You then load some string data into that buffer via the call to gethostname and then print out the length of buffer by assigning it to a std::string object using std::string::operator= that takes a const char *. One of the effects of this is that it will change an internal size variable of std::string to be strlen of the buffer, which is not necessarily the same as HOST_NAME_MAX. A call to std::string::length simply returns that variable.
In the latter case, you're using the std::string constructor that takes a size and initial character to construct test. This constructor sets the internal size variable to whatever size you passed in, which is HOST_NAME_MAX. The fact that you then copy some data into std::string's internal buffer has no bearing on its size variable. As with the other case, a call to the length() member function simply returns the size - which is HOST_NAME_MAX - regardless of whether or not the actual length of the data in the underlying buffer is smaller than HOST_NAME_MAX.
As @MattMcNabb mentioned in the comments, you could fix this by:
test.resize( strlen(test.c_str()) );
Why might you want to do this? Consistency with the char buffer approach might be a reason, but another reason may be performance oriented. In the latter case you're not only outright setting the length of the string to HOST_NAME_MAX, but also its capacity (omitting the SSO for brevity), which you can find starting on line 242 of libstdc++'s std::string implementation. What this means in terms of performance is that even though only, say, 25 characters are actually in your test string, the next time you append to that string (via +=, std::string::append, etc.), it's more than likely to have to reallocate and grow the string, as shown here, because the internal size and internal capacity are equal. Following @MattMcNabb's suggestion, however, the string's internal size is reduced down to the length of the actual payload, while keeping the capacity the same as before, and you avoid the almost immediate re-growth and re-copy of the string, as shown here.
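The whole pattern (size the string, let a C API fill it, then trim) can be sketched as follows. fill_name is a hypothetical stand-in for gethostname so that the example does not depend on any platform header; read_name is likewise a made-up wrapper.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for a C API such as gethostname: writes a
// null-terminated name into buf (capacity len) and returns 0 on success.
int fill_name(char* buf, std::size_t len) {
    const char name[] = "example-host";
    if (len < sizeof name) return -1;      // not enough room, report failure
    std::memcpy(buf, name, sizeof name);   // copies the '\0' as well
    return 0;
}

// Lets the C API write into a std::string, then trims the string to the
// length of the data that was actually written.
std::string read_name(std::size_t max_len) {
    std::string result(max_len, '\0');     // size() == max_len at this point
    if (fill_name(&result[0], result.size()) != 0) return std::string();
    result.resize(std::strlen(result.c_str()));
    assert(result.size() <= max_len);
    return result;
}
```

Before the resize, length() reports max_len; afterwards it reports the length of the name, which matches what the char-array version of the question prints.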

How to avoid providing length along with char*?

There is a function which sends data to the server:
int send(
_In_ SOCKET s,
_In_ const char *buf,
_In_ int len,
_In_ int flags
);
Providing length seems to me a little bit weird. I need to write a function, sending a line to the server and wrapping this one such that we don't have to provide length explicitly. I'm a Java-developer and in Java we could just invoke String::length() method, but now we're not in Java. How can I do that, unless providing length as a template parameter? For instance:
void sendLine(SOCKET s, const char *buf)
{
}
Is it possible to implement such a function?
Use std string:
void sendLine(SOCKET s, const std::string& buf) {
send (s, buf.c_str(), buf.size()+1, 0); //+1 will also transmit terminating \0.
}
On a side note: your wrapper function ignores the return value and doesn't take any flags.
You can retrieve the length of a C-string with the strlen(const char*) function.
Make sure all the strings are null-terminated, and keep in mind that strlen does not count the null terminator, so the data in memory is one byte longer than the reported length.
Edit: My answer originally only mentioned std::string. I've now also added std::vector<char> to account for situations where send is not used for strictly textual data.
First of all, you absolutely need a C++ book. You are looking for either the std::string class or for std::vector<char>, both of which are fundamental elements of the language.
Your question is a bit like asking, in Java, how to avoid char[] because you never heard of java.lang.String, or how to avoid arrays in general because you never heard of java.util.ArrayList.
For the first part of this answer, let's assume you are dealing with just text output here, i.e. with output where a char is really meant to be a text character. That's the std::string use case.
Providing length seems to me a little bit weird.
That's the way strings work in C. A C string is really a pointer to a memory location where characters are stored. Normally, C strings are null-terminated. This means that the last character stored for the string is '\0'. It means "the string stops here, and if you move further, you enter illegal territory".
Here is a C-style example:
#include <string.h>
#include <stdio.h>
void f(char const* s)
{
int l = strlen(s); // l = 3
printf("%s", s); // prints "foo"; never pass data as the format string
}
int main()
{
char* test = new char[4]; // avoid new[] in real programs
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
test[3] = '\0';
f(test);
delete[] test;
}
strlen just counts all characters at the specified position in memory until it finds '\0'. printf just writes all characters at the specified position in memory until it finds '\0'.
So far, so good. Now what happens if someone forgets about the null terminator?
char* test = new char[3]; // don't do this at home, please
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
f(test); // uh-oh, there is no null terminator...
The result will be undefined behaviour. strlen will keep looking for '\0'. So will printf. The functions will try to read memory they are not supposed to. The program is allowed to do anything, including crashing. The evil thing is that most likely, nothing will happen for a while because a '\0' just happens to be stored there in memory, until one day you are not so lucky anymore.
That's why C functions are sometimes made safer by requiring you to explicitly specify the number of characters. Your send is such a function. It works fine even without null-terminated strings.
So much for C strings. And now please don't use them in your C++ code. Use std::string. It is designed to be compatible with C functions by providing the c_str() member function, which returns a null-terminated char const * pointing to the contents of the string, and it of course has a size() member function to tell you the number of characters without the null-terminated character (e.g. for a std::string representing the word "foo", size() would be 3, not 4, and 3 is also what a C function like yours would probably expect, but you have to look at the documentation of the function to find out whether it needs the number of visible characters or number of elements in memory).
In fact, with std::string you can just forget about the whole null-termination business. Everything is nicely automated. std::string is exactly as easy and safe to use as java.lang.String.
Your sendLine should thus become:
void sendLine(SOCKET s, std::string const& line)
{
    send(s, line.c_str(), static_cast<int>(line.size()), 0); // last argument: flags
}
(Passing a std::string by const& is the normal way of passing big objects in C++. It's just for performance, but it's such a widely-used convention that your code would look strange if you just passed std::string.)
How can I do that, unless providing length as a template parameter?
This is a misunderstanding of how templates work. With a template, the length would have to be known at compile time. That's certainly not what you intended.
Now, for the second part of the answer, perhaps you aren't really dealing with text here. It's unlikely, as the name "sendLine" in your example sounds very much like text, but perhaps you are dealing with raw data, and a char in your output does not represent a text character but just a value to be interpreted as something completely different, such as the contents of an image file.
In that case, std::string is a poor choice. Your output could contain '\0' characters that do not have the meaning of "data ends here", but which are part of the normal contents. In other words, you don't really have strings anymore, you have a range of char elements in which '\0' has no special meaning.
For this situation, C++ offers the std::vector template, which you can use as std::vector<char>. It is also designed to be usable with C functions by providing a member function that returns a char pointer. Here's an example:
void sendLine(SOCKET s, std::vector<char> const& data)
{
    send(s, &data[0], static_cast<int>(data.size()), 0); // last argument: flags
}
(The unusual &data[0] syntax means "pointer to the first element of the encapsulated data". C++11 has nicer-to-read ways of doing this, but &data[0] also works in older versions of C++.)
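Both wrappers can be tried without a real socket. The sketch below is built on assumptions: FakeSocket and fake_send are hypothetical stand-ins that only mimic the pointer-plus-length shape of send, and data() is the C++11 accessor mentioned above. Unlike the wrapper in the question, these return the result of the underlying call.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a socket: collects everything that was "sent".
struct FakeSocket {
    std::vector<char> received;
};

// Hypothetical stand-in with the same shape as send(): pointer plus length.
int fake_send(FakeSocket& s, const char* buf, int len, int /*flags*/) {
    assert(buf != nullptr || len == 0);
    s.received.insert(s.received.end(), buf, buf + len);
    return len;
}

// Text: the std::string knows its own length, so the caller never passes one.
int send_line(FakeSocket& s, const std::string& line) {
    return fake_send(s, line.data(), static_cast<int>(line.size()), 0);
}

// Raw data: '\0' bytes are ordinary content and are transmitted as-is.
int send_bytes(FakeSocket& s, const std::vector<char>& data) {
    return fake_send(s, data.data(), static_cast<int>(data.size()), 0);
}
```

send_line(sock, "foo") hands three bytes to the transport, and send_bytes with the contents 'a', '\0', 'b' also hands over three bytes, including the embedded zero.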
Things to keep in mind:
std::string is like String in Java.
std::vector is like ArrayList in Java.
std::string is for a range of char with the meaning of text, std::vector<char> is for a range of char with the meaning of raw data.
std::string and std::vector are designed to work together with C APIs.
Do not use new[] in C++.
Understand the null termination of C strings.

Is there a safe version of strlen?

std::strlen doesn't handle c strings that are not \0 terminated. Is there a safe version of it?
PS I know that in c++ std::string should be used instead of c strings, but in this case my string is stored in a shared memory.
EDIT
Ok, I need to add some explanation.
My application is getting a string from a shared memory (which is of some length), therefore it could be represented as an array of characters. If there is a bug in the library writing this string, then the string would not be zero terminated, and the strlen could fail.
You've added that the string is in shared memory. That's guaranteed readable, and of fixed size. You can therefore use size_t MaxPossibleSize = startOfSharedMemory + sizeOfSharedMemory - input; strnlen(input, MaxPossibleSize) (mind the extra n in strnlen).
This will return MaxPossibleSize if there's no \0 in the shared memory following input, or the string length if there is. (The maximal possible string length is of course MaxPossibleSize-1, in case the last byte of shared memory is the first \0)
C strings that are not null-terminated are not C strings, they are simply arrays of characters, and there is no way of finding their length.
If you define a c-string as
const char* cowSays = "moo";
then you automagically get the '\0' at the end and strlen would return 3. If you define it like:
char iDoThis[1024] = {0};
you get an empty buffer (an array of characters, all of which are null characters). You can then fill it with what you like as long as you don't overrun the buffer length. At the start strlen would return 0, and once you have written something you would also get the correct number from strlen.
You could also do this:
char uhoh[100];
int len = strlen(uhoh);
but that would be bad, because you have no idea what is in that array. It might hit a null character, or it might not. The point is that the null character is the standard way to declare that the string is finished.
Not having a null character means by definition that the string is not finished. Changing that will break the paradigm of how the string works. What you want to do is make up your own rules. C++ will let you do that, but you will have to write a lot of code yourself.
EDIT
From your newly added info, what you want to do is loop over the array and check for the null character by hand. You should also do some validation if you are expecting ASCII characters only (especially if you are expecting alpha-numeric characters). This assumes that you know the maximum size.
If you do not need to validate the content of the string then you could use one of the strnlen family of functions:
http://msdn.microsoft.com/en-us/library/z50ty2zh%28v=vs.80%29.aspx
http://linux.about.com/library/cmd/blcmdl3_strnlen.htm
size_t safe_strlen(const char *str, size_t max_len)
{
const char * end = (const char *)memchr(str, '\0', max_len);
if (end == NULL)
return max_len;
else
return end - str;
}
Yes, since C11:
size_t strnlen_s( const char *str, size_t strsz );
Located in <string.h>
Get a better library, or verify the one you have - if you can't trust your library to do what it says it will, then how the h%^&l do you expect your program to?
That said, assuming you know the size of the buffer the string resides in, what about
buffer[-1+sizeof(buffer)]=0 ;
x = strlen(buffer) ;
Make the buffer bigger than needed and you can then test the lib:
assert(x<-1+sizeof(buffer));
C11 includes "safe" functions such as strnlen_s. strnlen_s takes an extra maximum length argument (a size_t). This argument is returned if a null character isn't found after checking that many characters. If a null pointer is provided, it returns 0.
size_t strnlen_s(const char *, size_t);
While part of C11, it is recommended that you check that your compiler supports these bounds-checking "safe" functions via its definition of __STDC_LIB_EXT1__. Furthermore, a user must also set another macro, __STDC_WANT_LIB_EXT1__, to 1, before including string.h, if they intend to use such functions. See here for some Stack Overflow commentary on the origins of these functions, and here for C++ documentation.
GCC and Clang also support the POSIX function strnlen, and provide it within string.h. Microsoft too provide strnlen which can also be found within string.h.
You will need to encode your string. For example:
struct string
{
size_t len;
char *data;
} __attribute__((packed));
You can then accept any array of characters if you know the first sizeof(size_t) bytes of the shared memory location is the size of the char array. It gets tricky when you want to chain arrays this way.
It's better to trust your other end to terminate its strings, or to roll your own strlen that does not go outside the boundaries of the shared memory segment (provided you know at least the size of that segment).
If you need to get the size of shared memory, try to use
// get memory size
struct shmid_ds shm_info;
size_t shm_size;
int shm_rc;
if((shm_rc = shmctl(shmid, IPC_STAT, &shm_info)) < 0)
exit(101);
shm_size = shm_info.shm_segsz;
Instead of using strlen you can use shm_size - 1 if you are sure that it is null terminated. Otherwise you can null terminate it by data[shm_size - 1] = '\0'; then use strlen(data);
A simple solution:
buff[BUFF_SIZE -1] = '\0';
Of course, this will not tell you whether the string originally was exactly BUFF_SIZE-1 long or was just not terminated, so you need extra logic for that.
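One possible form of that extra logic, as a sketch (the struct and function names are made up): look for an existing terminator first with memchr, and only force one if none is found, reporting which case occurred.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Result of inspecting a fixed-size buffer.
struct BoundedLength {
    std::size_t length;   // characters before the terminator
    bool was_terminated;  // true if a '\0' was already present in the buffer
};

// Looks for a terminator inside buf[0..size). If there is none, the last
// byte is overwritten with '\0' so the buffer becomes a valid C-string.
BoundedLength terminate_and_measure(char* buf, std::size_t size) {
    assert(buf != nullptr && size > 0);
    const void* end = std::memchr(buf, '\0', size);
    if (end != nullptr) {
        const char* terminator = static_cast<const char*>(end);
        return {static_cast<std::size_t>(terminator - buf), true};
    }
    buf[size - 1] = '\0';  // truncates the data by one character
    return {size - 1, false};
}
```

The caller can then treat was_terminated == false as a sign that the writer misbehaved, instead of silently accepting a truncated string.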
How about this portable nugget:
int safeStrlen(char *buf, int max)
{
    int i;
    // check the bound first so that buf[max] is never read
    for (i = 0; i < max && buf[i]; i++) {}
    return i;
}
As Neil Butterworth already said in his answer above: C-strings which are not terminated by a \0 character are not C-strings!
The only chance you have is to write an immutable adaptor, or something which creates a valid copy of the C-string with a terminating \0 character. Of course, if the input is wrong and a C-string is defined like
char cstring[3] = {'1','2','3'};
it will indeed result in unexpected behavior, because memory may now contain something like 123#4x\0. So the result of strlen(), for example, is now 6 and not 3 as expected.
The following approach shows how to create a safe C-String in any case:
char *createSafeCString(char cStringToCheck[]) {
    // Note: strlen itself relies on a terminating \0 in the input
    // Cast size_t to integer
    int size = static_cast<int>(strlen(cStringToCheck));
    // Allocate the new array on the heap so that it outlives this function
    char *pszCString = new char[size + 1];
    // Copy data from the old char array to the new one
    strncpy(pszCString, cStringToCheck, size);
    // Set the last character to the \0 termination character
    pszCString[size] = '\0';
    return pszCString; // the caller must delete[] the result
}
This ensures that later manipulation of the C-string does not write into memory belonging to something else.
But this is not what you wanted, I know. There is no other way to obtain the length of a char array without termination. This isn't even a real solution; it just ensures that your code keeps working even if the user (or developer) passes in *****.