Get part of a char array - c++

I feel like this is a really silly question, but I can't seem to find an answer anywhere!
Is it possible to get a group of chars from a char array? to throw down some pseudo-code:
char arry[20] = "hello world!";
char part[10] = arry[0-4];
printf(part);
output:
hello
So, can I get a segment of chars from an array like this without looping and getting them char-by-char or converting to strings so I can use substr()?

You could use memcpy (or strncpy) to get a substring:
memcpy(part, arry + 5 /* Offset */, 3 /* Length */);
part[3] = 0; /* Add terminator */
On another aspect of your code, note that doing printf(str) can lead to format string vulnerabilities if str contains untrusted input.
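For what it's worth, here is a slightly fuller sketch of the memcpy approach. The helper name copy_part and its parameters are made up for illustration, and it assumes the caller knows that offset + length stays inside the source array:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: copies `length` chars starting at `offset` from `src`
// into `dest` and null-terminates the result.
// Assumes offset + length does not run past the end of `src`, and that
// `dest_size` is the full capacity of `dest`.
void copy_part(char* dest, std::size_t dest_size,
               const char* src, std::size_t offset, std::size_t length)
{
    assert(length + 1 <= dest_size);   // room for the chars plus '\0'
    std::memcpy(dest, src + offset, length);
    dest[length] = '\0';               // memcpy does not add a terminator
}
```

With char arry[20] = "hello world!" and char part[10], calling copy_part(part, sizeof part, arry, 0, 5) leaves "hello" in part.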

In short, no. C-style "strings" simply don't work that way. You will either have to use a manual loop, or strncpy(), or do it via C++ std::string functionality. Given that you're in C++, you may as well do everything with C++ strings!
Side-note
As it happens, for your particular example application, you can achieve this simply via the functionality offered by printf():
printf("%.5s\n", arry);
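If the length or the starting point is only known at run time, the same trick should work with "%.*s", which takes the precision as an extra int argument. A minimal sketch, where format_part is a made-up helper and snprintf is used instead of printf only so that the result can be inspected:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper: formats `length` chars of `src`, starting at `offset`,
// into `out` using the "%.*s" precision specifier. Nothing is copied by hand;
// the precision tells the formatter where to stop reading.
// Returns what snprintf returns (the number of chars it wanted to write).
int format_part(char* out, std::size_t out_size,
                const char* src, int offset, int length)
{
    assert(offset >= 0 && length >= 0);
    return std::snprintf(out, out_size, "%.*s", length, src + offset);
}
```

The caller still has to make sure that offset lies inside the string; the precision only limits how far the formatter reads from that point.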

As Oli said, you'd need to use C++ std::string functionality. In your example:
std::string hello("Hello World!");
std::string part(hello.substr(0, 5)); // note it's <start>, <length>, so not '0-4'
std::cout << part;

Well, you do mention the two obvious approaches. The only thing I can think of would be to define your own substring type to work off character arrays:
struct SubArray
{
SubArray(const char* a, unsigned s, unsigned e)
:arrayOwnedElseWhere_(a),
start_(s),
end_(e)
{}
const char* arrayOwnedElseWhere_;
unsigned start_;
unsigned end_;
void print()
{
printf_s("%.*s\n", end_ - start_ + 1, arrayOwnedElseWhere_ + start_);
}
};
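Assuming a C++17 compiler is available, the standard library already has a type along these lines: std::string_view, a non-owning pointer-plus-length pair. A small sketch (part_of is a made-up name):

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Returns a non-owning view of up to `length` chars of `text`, starting at
// `start`. No characters are copied; the view is only valid for as long as
// the array behind `text` is alive.
std::string_view part_of(const char* text, std::size_t start, std::size_t length)
{
    std::string_view whole(text);          // scans for '\0' once to find the size
    assert(start <= whole.size());
    return whole.substr(start, length);    // clamps length to what is available
}
```

A view like this can be streamed directly, e.g. std::cout << part_of(arry, 0, 5), without creating a std::string.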

Related

How to avoid providing length along with char*?

There is a function which sends data to the server:
int send(
_In_ SOCKET s,
_In_ const char *buf,
_In_ int len,
_In_ int flags
);
Providing length seems to me a little bit weird. I need to write a function, sending a line to the server and wrapping this one such that we don't have to provide length explicitly. I'm a Java-developer and in Java we could just invoke String::length() method, but now we're not in Java. How can I do that, unless providing length as a template parameter? For instance:
void sendLine(SOCKET s, const char *buf)
{
}
Is it possible to implement such a function?
Use std string:
void sendLine(SOCKET s, const std::string& buf) {
send (s, buf.c_str(), buf.size()+1, 0); //+1 will also transmit terminating \0.
}
On a side note: your wrapper function ignores the return value and doesn't take any flags.
You can retrieve the length of a C string with the strlen(const char*) function.
Make sure all the strings are null-terminated, and keep in mind that strlen does not count the terminator: if you want to transmit it as well, the length grows by 1.
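A rough sketch of what such a strlen-based wrapper could look like. Socket and fake_send are stand-ins invented here so that the example compiles without a socket library; in real code they would be SOCKET and send:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstring>

// Hypothetical stand-ins for SOCKET and send(), for illustration only.
using Socket = int;
static int bytes_last_sent = 0;
int fake_send(Socket, const char*, int len, int /*flags*/)
{
    bytes_last_sent = len;   // remember the length instead of transmitting
    return len;
}

// Sends a null-terminated line; the length is computed with strlen, so the
// caller does not pass it. The terminator itself is not transmitted.
int sendLine(Socket s, const char* buf)
{
    const std::size_t len = std::strlen(buf);
    assert(len <= INT_MAX);   // the C API takes an int length
    return fake_send(s, buf, static_cast<int>(len), 0);
}
```

Returning the result of the underlying call lets the caller see errors or partial sends.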
Edit: My answer originally only mentioned std::string. I've now also added std::vector<char> to account for situations where send is not used for strictly textual data.
First of all, you absolutely need a C++ book. You are looking for either the std::string class or for std::vector<char>, both of which are fundamental elements of the language.
Your question is a bit like asking, in Java, how to avoid char[] because you never heard of java.lang.String, or how to avoid arrays in general because you never heard of java.util.ArrayList.
For the first part of this answer, let's assume you are dealing with just text output here, i.e. with output where a char is really meant to be a text character. That's the std::string use case.
Providing length seems to me a little bit weird.
That's the way strings work in C. A C string is really a pointer to a memory location where characters are stored. Normally, C strings are null-terminated. This means that the last character stored for the string is '\0'. It means "the string stops here, and if you move further, you enter illegal territory".
Here is a C-style example:
#include <string.h>
#include <stdio.h>
void f(char const* s)
{
int l = strlen(s); // l = 3
printf(s); // prints "foo"
}
int main()
{
char* test = new char[4]; // avoid new[] in real programs
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
test[3] = '\0';
f(test);
delete[] test;
}
strlen just counts all characters at the specified position in memory until it finds '\0'. printf just writes all characters at the specified position in memory until it finds '\0'.
So far, so good. Now what happens if someone forgets about the null terminator?
char* test = new char[3]; // don't do this at home, please
test[0] = 'f';
test[1] = 'o';
test[2] = 'o';
f(test); // uh-oh, there is no null terminator...
The result will be undefined behaviour. strlen will keep looking for '\0'. So will printf. The functions will try to read memory they are not supposed to. The program is allowed to do anything, including crashing. The evil thing is that most likely, nothing will happen for a while because a '\0' just happens to be stored there in memory, until one day you are not so lucky anymore.
That's why C functions are sometimes made safer by requiring you to explicitly specify the number of characters. Your send is such a function. It works fine even without null-terminated strings.
So much for C strings. And now please don't use them in your C++ code. Use std::string. It is designed to be compatible with C functions by providing the c_str() member function, which returns a null-terminated char const * pointing to the contents of the string, and it of course has a size() member function to tell you the number of characters without the null terminator (e.g. for a std::string representing the word "foo", size() would be 3, not 4, and 3 is also what a C function like yours would probably expect, but you have to look at the documentation of the function to find out whether it needs the number of visible characters or the number of elements in memory).
In fact, with std::string you can just forget about the whole null-termination business. Everything is nicely automated. std::string is exactly as easy and safe to use as java.lang.String.
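To make the size() versus terminator point concrete, here is a small sketch (c_length is a made-up helper) that looks at a std::string through its C view:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Returns the length that a C function would see through c_str().
// size() does not count the terminator, but c_str() always supplies one
// right after the last character.
std::size_t c_length(const std::string& s)
{
    const char* p = s.c_str();
    assert(p[s.size()] == '\0');   // the terminator sits at index size()
    return std::strlen(p);         // equals size() unless s contains '\0' itself
}
```

For "foo" both size() and c_length give 3. A std::string may also hold '\0' characters of its own, in which case the C view stops early while size() still reports the full length.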
Your sendLine should thus become:
void sendLine(SOCKET s, std::string const& line)
{
send(s, line.c_str(), static_cast<int>(line.size()), 0); // send() also takes a flags argument; 0 means none
}
(Passing a std::string by const& is the normal way of passing big objects in C++. It's just for performance, but it's such a widely-used convention that your code would look strange if you just passed std::string.)
How can I do that, unless providing length as a template parameter?
This is a misunderstanding of how templates work. With a template, the length would have to be known at compile time. That's certainly not what you intended.
Now, for the second part of the answer, perhaps you aren't really dealing with text here. It's unlikely, as the name "sendLine" in your example sounds very much like text, but perhaps you are dealing with raw data, and a char in your output does not represent a text character but just a value to be interpreted as something completely different, such as the contents of an image file.
In that case, std::string is a poor choice. Your output could contain '\0' characters that do not have the meaning of "data ends here", but which are part of the normal contents. In other words, you don't really have strings anymore, you have a range of char elements in which '\0' has no special meaning.
For this situation, C++ offers the std::vector template, which you can use as std::vector<char>. It is also designed to be usable with C functions by providing a member function that returns a char pointer. Here's an example:
void sendLine(SOCKET s, std::vector<char> const& data)
{
send(s, &data[0], static_cast<int>(data.size()), 0); // send() also takes a flags argument; 0 means none
}
(The unusual &data[0] syntax means "pointer to the first element of the encapsulated data". C++11 has nicer-to-read ways of doing this, such as data.data(), but &data[0] also works in older versions of C++.)
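As a sketch of why std::vector<char> suits raw data, the following passes a buffer containing '\0' bytes to a C-style function that takes a pointer and a length. count_zero_bytes is a hypothetical stand-in for an API like send:

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

// Hypothetical C-style function taking a pointer plus an explicit length.
int count_zero_bytes(const char* buf, int len)
{
    int zeros = 0;
    for (int i = 0; i < len; ++i)
        if (buf[i] == '\0') ++zeros;
    return zeros;
}

// Passes raw data to the C-style function. data() (C++11) is the
// easier-to-read spelling of &data[0], and is also fine for an empty vector.
int zeros_in(const std::vector<char>& data)
{
    assert(data.size() <= INT_MAX);   // the C-style API takes an int length
    return count_zero_bytes(data.data(), static_cast<int>(data.size()));
}
```

The '\0' bytes are treated as ordinary content because the length comes from size(), not from searching for a terminator.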
Things to keep in mind:
std::string is like String in Java.
std::vector is like ArrayList in Java.
std::string is for a range of char with the meaning of text, std::vector<char> is for a range of char with the meaning of raw data.
std::string and std::vector are designed to work together with C APIs.
Do not use new[] in C++.
Understand the null termination of C strings.

Initialization of Chars [duplicate]

I have been wondering, why can I not write my code like so:
char myChar[50];
myChar = "This is a really cool char!";
Or at least like this:
char myChar[50];
myChar[0] = "This is a really cool char!";
The second way makes more sense that it should work, to me, seeing that I would
start the array at the point I want it to start moving the letters into each spot in the
array.
Does anyone know why C++ does not do this? And can you show me the reasoning behind the
right way to do it?
Thank you all in advance!
The first line:
char myChar[50];
...allocates an array of 50 characters on the stack. The second line:
myChar = "This is a really cool char!";
...is attempting to assign a string literal (a constant static array of char, which typically lives in read-only memory in the text segment of your code) to the array itself. Arrays are not assignable, so the compiler rejects this as an incompatible lvalue/rvalue assignment. This approach:
const char* myChar = "This is a really cool char";
...will work, because a pointer, unlike an array, can be made to point at a string literal. This does not have to happen at initialization time either; a const char* can be assigned a string literal later, like so:
/*******************************************************************************
* Preprocessor Directives
******************************************************************************/
#include <stdio.h>
/*******************************************************************************
* Function Prototypes
******************************************************************************/
const char* returnErrorString(int iError);
/*******************************************************************************
* Function Definitions
******************************************************************************/
int main(void) {
int i;
for (i=(-1); i<3; i++) {
printf("i=%d - Error String:%s\n", i, returnErrorString(i));
}
return 0;
}
const char* returnErrorString(int iError) {
const char* ret = NULL;
switch (iError) {
case 0:
ret = "No error";
break;
case (-1):
ret = "Invalid input";
break;
default:
ret = "Unknown error";
break;
}
return ret;
}
You might benefit from reading the post in my references below. It will give you some info on how code, variables, constants, etc, are broken into different segments of the final binary, and why some approaches don't even make sense. Also, it would be beneficial to read up a bit on terminology like integer literals, string literals, l-values, r-values, etc.
Good luck!
References
Difference between declared string and allocated string, Accessed 2014-05-01, <https://stackoverflow.com/questions/16021454/difference-between-declared-string-and-allocated-string>
You must initialise the array of chars inside the declaration of the array. There is actually no reason for not doing so, as if not, your array will contain garbage values until you initialise it. I advise you to look at this link:
Char array declaration and initialization in C
Also, you are allocating a char array of size 50 but only using 28 elements of it, this would appear to me to be a waste...
Try the following for simple string initialisations:
char mychar[12] = "hello world"; // 11 characters plus the terminating '\0'
Or...
const char *mychar = "hello world"; // string literals are const in C++
I hope this helps...
If you want to think of this in holistic terms, the reason is that myChar isn't a string -- it's just an array of char. Hence "FooBar" and char [50] are completely different types. Related in a sense, but really not.
Now some might say, "but "FooBar" is a string, and char [50] is really just a string too." But that is going on the assumption that myChar is the same as "FooBar", and it's not. It's also assuming that the compiler understands that both char[50] and char* are pointers to strings. The compiler doesn't understand that. There could be any manner of thing stored in those places that have nothing to do with strings.
"But myChar is just a pointer?"
That is the reason why people think that the assignment should be a natural thing -- but the fundamental premise is wrong. myChar is not a pointer. It is an array. A name which refers to an array will decay into a pointer at the drop of a hat, but an array is not a pointer.
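Since the array cannot be assigned to, the text has to be copied into it. One possible sketch, where assign_text is a made-up helper that takes the array by reference so that its size is not lost to pointer decay:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Copies `text` into a char array after the array has been declared.
// Taking the array by reference keeps its size N, which a plain char*
// parameter would lose when the array decays to a pointer.
// Returns true if the whole text fit; the result is always null-terminated.
template <std::size_t N>
bool assign_text(char (&dest)[N], const char* text)
{
    static_assert(N > 0, "destination must have room for the terminator");
    assert(text != nullptr);
    const int wanted = std::snprintf(dest, N, "%s", text);
    return wanted >= 0 && static_cast<std::size_t>(wanted) < N;
}
```

With char myChar[50], the call assign_text(myChar, "This is a really cool char!") fills the array and returns true. If the text is too long for the array, it is truncated and the function returns false.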

char* to char[]

I have char* which is of fixed (known) width but is not null terminated.
I want to pass it into LOG4CPLUS_ERROR("Bad string " << char_pointer); but as it's not null-terminated it will keep printing past the intended end.
Any suggestions of some light weight way of getting "(char[length])*char_pointer" without performing a copy?
No, you'll have to deep-copy and null-terminate it. That code expects a null-terminated string and it means a contiguous block of characters ending with a null terminator.
If your goal is to print such a string, you could:
Store the last byte.
Replace it with \0.
Print the string.
Print the stored byte.
Put the stored byte back into the last position of the string.
Wrap all this in a function.
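The steps above could be wrapped up roughly like this. print_unterminated is a made-up name, and the sketch assumes the buffer is writable and not in use by another thread while it is temporarily modified:

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>   // std::ostringstream, handy for trying the helper out

// Hypothetical helper following the steps above. It temporarily writes into
// `buf`, so the buffer must be writable. Assumes the data holds no '\0' of
// its own; otherwise the first part of the output stops at that byte.
void print_unterminated(std::ostream& os, char* buf, std::size_t len)
{
    assert(buf != nullptr);
    if (len == 0) return;
    const char saved = buf[len - 1];   // store the last byte
    buf[len - 1] = '\0';               // replace it with a terminator
    os << buf;                         // print the first len - 1 chars
    os << saved;                       // print the stored byte
    buf[len - 1] = saved;              // put it back
}
```

No copy of the string is made, at the price of briefly modifying the caller's buffer.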
Real iostreams
When you're writing to a real iostream, then you can just use ostream.write() which takes a char* and a size for how many bytes to write -- no null termination necessary. (In fact, any null characters in the string would be written to the ostream, and would not be used to determine the size.)
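A minimal sketch of the write() approach (write_fixed is a made-up wrapper name):

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>   // std::ostringstream, handy for trying the helper out

// Writes exactly `len` bytes from `data` to `os`. No terminator is needed,
// and any '\0' bytes inside the range are written like any other byte.
void write_fixed(std::ostream& os, const char* data, std::size_t len)
{
    assert(data != nullptr || len == 0);
    os.write(data, static_cast<std::streamsize>(len));
}
```

This works with std::cout, file streams and string streams alike, since they all derive from std::ostream.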
Logging libraries
In some logging libraries, the stream that you write to is not a real iostream. This is the case in Log4CPP.
However, in Log4CPlus, which is what matt appears to be using, the object that you're writing to is a std::basic_ostringstream<tchar> (see loggingmacros.h and streams.h for the definition, since none of this is obvious from the documentation). There's just one problem: in the macro LOG4CPLUS_ERROR, the first << is already built into the macro, so he won't be able to call LOG4CPLUS_ERROR(.write(char_pointer,length)) or anything like that. Unfortunately, I don't see an easy way around this without deconstructing the LOG4CPLUS_ERROR macro and getting into the internals of Log4CPlus.
Solution
I'm not sure why you're trying to avoid copying the string at this point, since you can see that there's a lot of copying going on inside the logging library. Any attempt to avoid that extra string copy is probably unwarranted optimization.
I'm going to assume that it's an issue of code cleanliness, and maybe an issue of making sure the copy happens inside the LOG4CPLUS_ERROR macro, as opposed to outside it. In that case, just use:
LOG4CPLUS_ERROR("Bad string " << std::string(char_pointer, length));
We're getting hung up on the semantics of conversion between char* and char[]. Take a step back: what are you trying to do? If this is simply a case of streaming out the content of a structure when an error occurs, why not do it properly?
e.g.
struct foo
{
char a1[10];
char a2[10];
char a3[10];
char a4[10];
};
// free function to stream the above structure properly..
std::ostream& operator<<(std::ostream& str, foo const& st) // return by reference; streams cannot be copied
{
str << "foo::a1[";
std::copy(st.a1, st.a1 + sizeof(st.a1), std::ostream_iterator<char>(str));
str << "]\n";
str << "foo::a2[";
std::copy(st.a2, st.a2 + sizeof(st.a2), std::ostream_iterator<char>(str));
str << "]\n";
:
return str;
}
Now you can simply stream out an instance of foo and don't have to worry about null terminated string etc.!
I keep a string reference class in my toolkit just for these types of situations. Here is a greatly abbreviated version of that class. I trimmed away anything that is not relevant to this particular problem:
#include <iostream>
class stringref {
public:
stringref(const char* ptr, unsigned len) : ptr(ptr), len(len) {}
unsigned length() { return len; }
const char* data() { return ptr; }
private:
const char* ptr;
unsigned len;
};
std::ostream& operator<< (std::ostream& os, stringref sr) {
const char* data = sr.data();
for (unsigned len = sr.length(); len--; )
os << *data++;
return os;
}
using namespace std;
int main (int argc, const char * argv[])
{
cout << "string: " << stringref("test", 4) << endl;
}
or, in your case:
LOG4CPLUS_ERROR("Bad string " << stringref(char_pointer, length));
should work.
The idea of a string reference class is to keep enough information about a string (a size and a pointer) to refer to any block of memory which represents a string. It relies on you to make sure that the underlying string data is valid throughout the lifetime of a stringref object. This way you can pass around and process string information with a minimum of overhead.
When you know it is of fixed length: why not simply add one more character to the size of the array? Then you can easily fill this last char with the terminating \0 character and all will be fine.
No, you'll have to copy it. There is no proper conversion in the language that you can use to get the array type out of it.
It seems very odd that you want to do this, or that you have a non-terminated C-style string in the first place.
Why are you not using std::string?

implementation Strcat Function

I've got a programming question about the implementation of strcat() function.
I have been trying to solve that problem and I got some Access violation error.
My created function:
char str_cat(char str1, char str2)
{
return str1-'\0'+str2;
}
what is wrong in the above code?
One more question please,
is "iostream" a header file? where can I get it?
thanks
Unfortunately, everything is wrong with this function, even the return type and argument types. It should look like
char * strcat(const char *str1, const char *str2)
and it should work by allocating a new block of memory large enough to hold the concatenated strings using either malloc (for C) or new (for C++), then copy both strings into it. I think you've got your (home)work cut out for you, though, as I don't think you know much of what any of that means.
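A sketch of that allocate-then-copy approach. The name concat is made up, and std::unique_ptr<char[]> is an addition here so that the new[] block is freed automatically; the answer itself only mentions malloc or new:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Allocates a block large enough for both strings plus the terminator,
// then copies them into it one after the other.
std::unique_ptr<char[]> concat(const char* str1, const char* str2)
{
    assert(str1 != nullptr && str2 != nullptr);
    const std::size_t len1 = std::strlen(str1);
    const std::size_t len2 = std::strlen(str2);
    std::unique_ptr<char[]> result(new char[len1 + len2 + 1]);  // +1 for '\0'
    std::memcpy(result.get(), str1, len1);
    std::memcpy(result.get() + len1, str2, len2 + 1);  // copies str2's '\0' too
    return result;
}
```

Unlike the standard strcat, this does not write into its first argument, so neither input needs spare room at the end.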
Nothing is right in the above code.
You need to take char * parameters
You need to return a char * if you have to return something (which isn't needed)
You'll need to loop over the string copying individual characters - no easy solution with + and -
You'll need to 0-terminate the result
E.g. like this:
void strcat(char * Dest, char const * Src) {
char * d = Dest;
while (*d) ++d; // stop on the terminator itself, not one past it
char const * s = Src;
while (*s) { *d++ = *s++; }
*d = 0;
}
Why do you need to do this? There's a perfectly good strcat in the standard library, and there's a perfectly good std::string which you can use + on.
Don't want to sound negative but there is not much right with this code.
Firstly, strings in C are char*, not char.
Second, there is no way to 'add' or 'subtract' them the way you would hope (which is sort of kind of possible in, say, python).
iostream is the standard I/O header for C++, it should be bundled with your distribution.
I would really suggest a tutorial on pointers to get you going - this I found just by googling "ansi c pointers" - I'm guessing the problem asks you for a C answer as opposed to C++, since in C++ you would use std::string and the overloaded operator+.

Is there any safe strcmp?

I made a function like this:
bool IsSameString(char* p1, char* p2)
{
return 0 == strcmp(p1, p2);
}
The problem is that sometimes, by mistake, arguments are passed which are not strings (meaning that p1 or p2 is not terminated with a null character).
Then, strcmp continues comparing until it reaches non-accessible memory and crashes.
Is there a safe version of strcmp? Or can I tell whether p1 (and p2) is a string or not in a safe manner?
No, there's no (standard) way to tell whether a char * actually points to valid memory.
In your situation, it is better to use std::string rather than char *s for all your strings, along with the overloaded == operator. If you do this, the compiler would enforce type safety.
EDIT: As per the comments below, if you find yourself in a situation where you sometimes pass char *s that may or may not be valid strings to functions that expect null-terminated strings, then something is fundamentally wrong with your approach, so basically see @janm's answer below.
In some cases std::strncmp can solve your problem:
int strncmp ( const char * str1, const char * str2, size_t num );
It compares up to num characters of the C string str1 to those of the C string str2.
Also, take a look, what the US DHS National Cyber Security Division recommends on this matter:
Ensure that strings are null terminated before passing into strcmp. This can be enforced by always placing a \0 in the last allocated byte of the buffer.
char str1[] ="something";
char str2[] = "another thing";
/* In this case we know strings are null terminated. Pretend we don't. */
str1[sizeof(str1)-1] = '\0';
str2[sizeof(str2)-1] = '\0';
/* Now the following is safe. */
if (strcmp(str1, str2)) { /* do something */ } else { /* do something else */ }
If you are passing strings to strcmp() that are not null terminated you have already lost. The fact that you have a string that is not null terminated (but should be) indicates that you have deeper issues in your code. You cannot change strcmp() to safely deal with this problem.
You should be writing your code so that can never happen. Start by using the string class. At the boundaries where you take data into your code you need to make sure you deal with the exceptional cases; if you get too much data you need to Do The Right Thing. That does not involve running off the end of your buffer. If you must perform I/O into a C style buffer, use functions where you specify the length of the buffer and detect and deal with cases where the buffer is not large enough at that point.
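At such a boundary, where the capacity of the buffer is known, it is possible to check whether the data is terminated before treating it as a string. A minimal sketch (is_terminated is a made-up name):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns true if a '\0' occurs within the first `capacity` bytes of `buf`,
// i.e. if `buf` holds a properly terminated string. This only works when the
// caller really knows the buffer's capacity; it cannot validate a bare char*.
bool is_terminated(const char* buf, std::size_t capacity)
{
    assert(buf != nullptr);
    return std::memchr(buf, '\0', capacity) != nullptr;
}
```

This does not make strcmp itself safe; it only lets the code that owns the buffer reject bad input before it travels any further.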
There's no cure for this that is portable. The convention states that there's an extra character holding a null character that belongs to the same correctly allocated block of memory as the string itself. Either this convention is followed and everything's fine or undefined behaviour occurs.
If you know the length of the string you compare against you can use strncmp(), but this will not help if the string passed to your code is actually shorter than the string you compare against.
You can use strncmp, but if possible use std::string to avoid many problems :)
You can put an upper limit on the number of characters to be compared using the strncmp function.
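For example, a sketch along these lines, where same_within is a made-up name and limit is assumed not to exceed the capacity of the smaller buffer:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Compares two char buffers, reading at most `limit` chars from each.
// strncmp stops at a '\0' or after `limit` chars, whichever comes first,
// so it cannot run past either buffer as long as `limit` does not exceed
// the smaller capacity.
bool same_within(const char* p1, const char* p2, std::size_t limit)
{
    assert(p1 != nullptr && p2 != nullptr);
    return std::strncmp(p1, p2, limit) == 0;
}
```

Note that two buffers which agree on their first limit chars compare equal here, even if neither of them is terminated.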
There is no best answer to this, as you can't verify that the char* is a string. The only solution is to create a type and use it for strings, for example std::string, or create your own if you want something lighter, i.e.
struct MyString
{
MyString() : str(0), len(0) {}
MyString( const char* x ) { len = strlen(x); str = strdup(x); }
~MyString() { if(str) free(str); }
char* str;
size_t len;
};
bool IsSameString(MyString& p1, MyString& p2)
{
return 0 == strcmp(p1.str, p2.str);
}
MyString str1("test");
MyString str2("test");
if( IsSameString( str1, str2 ) ) {}
You don't say which platform you are using. Windows has the following functions:
IsBadStringPtr
IsBadReadPtr
IsBadWritePtr
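IsBadStringPtr might be what you are looking for, if you are using Windows. Note, however, that Microsoft's documentation marks these functions as obsolete and advises against using them.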