How to work with null pointers in a std::vector - c++

Say I have a vector of null-terminated strings, some of which may be null pointers. I don't even know if this is legal; it is a learning exercise. Example code:
std::vector<char*> c_strings1;
char* p1 = "Stack Over Flow";
c_strings1.push_back(p1);
p1 = NULL; // I am puzzled you can do this and what exactly is stored at this memory location
c_strings1.push_back(p1);
p1 = "Answer";
c_strings1.push_back(p1);
for(std::vector<char*>::size_type i = 0; i < c_strings1.size(); ++i)
{
if( c_strings1[i] != 0 )
{
cout << c_strings1[i] << endl;
}
}
Note that the size of the vector is 3 even though I have a NULL at c_strings1[1].
Question: how can you rewrite this code using std::vector<char>?
What exactly is stored in the vector when you push a null value?
EDIT
The first part of my question has been thoroughly answered but not the second, at least not to my satisfaction. I do want to see usage of vector<char>, not some nested variant or std::vector<std::string>; those are familiar. So here is what I tried (hint: it does not work):
std::vector<char> c_strings2;
string s = "Stack Over Flow";
c_strings2.insert(c_strings2.end(), s.begin(), s.end() );
// char* p = NULL;
s = ""; // this is not really NULL, But would want a NULL here
c_strings2.insert(c_strings2.end(), s.begin(), s.end() );
s = "Answer";
c_strings2.insert(c_strings2.end(), s.begin(), s.end() );
const char *cs = &c_strings2[0];
while (cs <= &c_strings2[2])
{
std::cout << cs << "\n";
cs += std::strlen(cs) + 1;
}

You don't have a vector of strings; you have a vector of pointer-to-char. NULL is a perfectly valid pointer-to-char that happens not to point to anything, so it is stored in the vector like any other pointer value.
Note that the pointers you are actually storing point to string literals. The strings are not copied.
It doesn't make a lot of sense to mix the C++-style vector with C-style char pointers. It's not illegal to do so, but mixing paradigms like this often results in confused and broken code.
Instead of using a vector<char*> or a vector<char>, why not use a vector<string>?
EDIT
Based on your edit, it seems that what you're trying to do is flatten several strings into a single vector<char>, with a null terminator between each of the flattened strings.
Here's a simple way to accomplish this:
#include <algorithm>
#include <vector>
#include <string>
#include <iterator>
using namespace std;
int main()
{
// create a vector of strings...
typedef vector<string> Strings;
Strings c_strings;
c_strings.push_back("Stack Over Flow");
c_strings.push_back("");
c_strings.push_back("Answer");
/* Flatten the strings in to a vector of char, with
a NULL terminator between each string
So the vector will end up looking like this:
S t a c k _ O v e r _ F l o w \0 \0 A n s w e r \0
***********************************************************/
vector<char> chars;
for( Strings::const_iterator s = c_strings.begin(); s != c_strings.end(); ++s )
{
// append this string to the vector<char>
copy( s->begin(), s->end(), back_inserter(chars) );
// append a null-terminator
chars.push_back('\0');
}
}

So,
char *p1 = "Stack Over Flow";
char *p2 = NULL;
char *p3 = "Answer";
If you notice, the type of all three of those is exactly the same. They are all char *. Because of this, we would expect them all to have the same size in memory as well.
You may think that it doesn't make sense for them to have the same size in memory, because p3 is shorter than p1. What actually happens, is that the compiler, at compile-time, will find all of the strings in the program. In this case, it would find "Stack Over Flow" and "Answer". It will throw those to some constant place in memory, that it knows about. Then, when you attempt to say that p3 = "Answer", the compiler actually transforms that to something like p3 = 0x123456A0.
Therefore, with either version of the push_back call, you are only pushing into the vector a pointer, not the actual string itself.
The vector itself doesn't know or care that a NULL char* might be meant as an empty string. It simply counts the three pointers you pushed into it, so it reports a size of 3.

I have a funny feeling that what you would really want is to have the vector contain something like "Stack Over Flow Answer" (possibly without space before "Answer").
In this case, you can use a std::vector<char>, you just have to push the whole arrays, not just pointers to them.
This cannot be accomplished with a single push_back call; however, vector has an insert method that accepts ranges.
#include <cstring>
#include <iostream>
#include <vector>
/// Maintain the invariant that the vector shall be null terminated
/// p shall be either null or point to a null terminated string
void push_back(std::vector<char>& v, char const* p) {
    if (p) {
        v.insert(v.end(), p, p + std::strlen(p));
    }
    v.push_back('\0');
} // push_back
int main() {
    std::vector<char> v;
    push_back(v, "Stack Over Flow");
    push_back(v, 0);
    push_back(v, "Answer");
    // step from one string to the next: its length plus its terminator
    for (std::size_t i = 0, max = v.size(); i < max; i += std::strlen(&v[i]) + 1) {
        std::cout << &v[i] << "\n";
    }
}
This uses a single contiguous buffer to store multiple null-terminated strings. Passing a null pointer to push_back stores an empty string, so an empty line is displayed for it.

What exactly is stored in the vector when you push a null value?
A NULL. You're storing pointers, and NULL is a possible value for a pointer. Why is this unexpected in any way?
Also, use std::string as the value type (i.e. std::vector<std::string>), char* shouldn't be used unless it's needed for C interop. To replicate your code using std::vector<char>, you'd need std::vector<std::vector<char>>.

You have to be careful when storing pointers in STL containers: copying the container copies only the pointers (a shallow copy), so both containers end up referring to the same character data.
With regard to your specific question, the vector will store a pointer of type char* regardless of whether or not that pointer points to something. It's entirely possible you would want to store a null-pointer of type char* within that vector for some reason - for example, what if you decide to delete that character string at a later point from the vector? Vectors only support amortized constant time for push_back and pop_back, so there's a good chance if you were deleting a string inside that vector (but not at the end) that you would prefer to just set it null quickly and save some time.
Moving on - I would suggest making a std::vector<std::vector<char> > if you want a dynamic array of strings, which looks like what you're going for.
A std::vector<char> as you mentioned would be useless compared to your original code, because your original code stores a dynamic array of strings and a std::vector<char> would only hold one dynamically changeable string (as a string is essentially an array of characters).

NULL is just 0. A pointer with value 0 has a meaning. But a char with value 0 has a different meaning. It is used as a delimiter to show the end of a string. Therefore, if you use std::vector<char> and push_back 0, the vector will contain a character with value 0. vector<char> is a vector of characters, while std::vector<char*> is a vector of C-style strings -- very different things.
Update. As the OP wants, I am giving an idea of how to store (in a vector) null terminated strings some of which are nulls.
Option 1: Suppose we have vector<char> c_strings;. Then, we define a function to store a string pi. A lot of complexity is introduced since we need to distinguish between an empty string and a null char*. We select a delimiting character that does not occur in our usage. Suppose this is the '~' character.
#include <cstring>
#include <vector>
using std::vector;
const char delimiter = '~';
// push each character in pi into c_strings
void push_into_vec(vector<char>& c_strings, const char* pi) {
    if (pi != 0) {
        for (const char* p = pi; *p != '\0'; p++)
            c_strings.push_back(*p);
        // also add a NUL character to denote end-of-string
        c_strings.push_back('\0');
    }
    c_strings.push_back(delimiter);
    // Note that a NULL pointer would be stored as a single '~' character
    // while an empty string would be stored as '\0~'.
}
// now a method to retrieve each of the stored strings.
vector<char*> get_stored_strings(const vector<char>& c_strings) {
    vector<char*> r;
    if (c_strings.empty()) return r; // &c_strings[0] is not valid on an empty vector
    const char* end = &c_strings[0] + c_strings.size();
    const char* current = &c_strings[0];
    bool nullstring = true;
    for (const char* c = current; c != end; c++) {
        if (*c == '\0') {
            std::size_t size = c - current; // length without the NUL
            char* nc = new char[size + 1];
            std::memcpy(nc, current, size + 1); // copy the NUL as well
            r.push_back(nc);
            nullstring = false;
        }
        if (*c == delimiter) {
            if (nullstring) r.push_back(0);
            nullstring = true; // reset nullstring for the next string
            current = c + 1; // set the next string
        }
    }
    return r;
}
You still need to call delete[] on the memory allocated by new[] above. All this complexity is taken care of by using the string class. I very rarely use char* in C++.
Option 2: You could use vector<boost::optional<char> >. Then the '~' can be replaced by an empty boost::optional, but the other parts are the same as in option 1. The memory usage in this case would be higher, though.

Related

How to convert a std::string which contains '\0' to a char* array?

I have a string like,
string str="aaa\0bbb";
and I want to copy the value of this string to a char* variable. I tried the following methods but none of them worked.
char *c=new char[7];
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
strcpy(c,str.data()); // c="aaa"
str.copy(c,7); // c="aaa"
How can I copy that string to a char* variable without losing any data?
You can do it the following way
#include <iostream>
#include <string>
#include <cstring>
int main()
{
std::string s( "aaa\0bbb", 7 );
char *p = new char[s.size() + 1];
std::memcpy( p, s.c_str(), s.size() );
p[s.size()] = '\0';
size_t n = std::strlen( p );
std::cout << p << std::endl;
std::cout << p + n + 1 << std::endl;
delete[] p; // release the buffer allocated with new[]
}
The program output is
aaa
bbb
You need to keep somewhere in the program the allocated memory size for the character array equal to s.size() + 1.
If there is no need to keep the "second part" of the object as a string then you may allocate memory of the size s.size() and not append it with the terminating zero.
In fact these methods used by you
memcpy(c,&str[0],7); // c="aaa"
memcpy(c,str.data(),7); // c="aaa"
str.copy(c,7); // c="aaa"
are correct. They copy exactly 7 characters, provided that you are not going to append a terminating zero to the resulting array. The problem is that you are trying to output the resulting character array as a string, and the operators you use output only the characters before the embedded zero character.
Your string consists of 3 characters. You may try to use
using namespace std::literals;
string str="aaa\0bbb"s;
to create a string with \0 inside; it will consist of 7 characters.
It still won't help if you use it as a C string ((const) char*). C strings can't contain a zero character.
There are two things to consider: (1) make sure that str already contains the complete literal (the constructor taking only a char* parameter might truncate at the string terminator char). (2) Provided that str actually contains the complete literal, statement memcpy(c,str.data(),7) should work. The only thing then is how you "view" the result, because if you pass c to printf or cout, then they will stop printing once the first string terminating character is reached.
So: To make sure that your string literal "aaa\0bbb" gets completely copied into str, use std::string str("aaa\0bbb",7); Then, try to print the contents of c in a loop, for example:
std::string str("aaa\0bbb",7);
const char *c = str.data();
for (int i=0; i<7; i++) {
printf("%c", c[i] ? c[i] : '0');
}
You already did (not really, see edit below). The problem however, is that whatever you are using to print the string (printf?), is using the c string convention of ending strings with a '\0'. So it starts reading your data, but when it gets to the 0 it will assume it is done (because it has no other way).
If you want to simply write the buffer to the output, you will have to do this with something like
write(STDOUT_FILENO, c, 7); // POSIX write() from <unistd.h>; it takes a file descriptor, not a FILE*
Now write has information about where the data ends, so it can write all of it.
Note however that your terminal cannot really show a \0 character, so it might show some weird symbol or nothing at all. If you are on linux you can pipe into hexdump to see what the binary output is.
EDIT:
Just realized that your string is also initialized from a const char* by reading until the zero. So you will also have to use a constructor that tells it to read past the zero:
std::string("data\0afterzero", 14);
(there are prettier solutions probably)

C++ tolower/toupper char pointer

Do you guys know why the following code crashes at runtime?
char* word;
word = new char[20];
word = "HeLlo";
for (auto it = word; it != NULL; it++){
*it = (char) tolower(*it);
I'm trying to lowercase a char* (string). I'm using visual studio.
Thanks
You cannot compare it to NULL. Instead you should be comparing *it to '\0'. Or better yet, use std::string and never worry about it :-)
In summary, when looping over a C-style string, you should loop until the character you see is a '\0'. The iterator itself will never be NULL, since it is simply pointing to a place in the string. The fact that the iterator has a type which can be compared to NULL is an implementation detail that you shouldn't touch directly.
Additionally, you are trying to write to a string literal. Which is a no-no :-).
EDIT:
As noted by @Cheers and hth. - Alf, tolower can break if given negative values. So sadly, we need to add a cast to make sure this won't break if you feed it Latin-1 encoded data or similar.
This should work:
char word[] = "HeLlo";
for (auto it = word; *it != '\0'; ++it) {
*it = tolower(static_cast<unsigned char>(*it));
}
You're setting word to point to the string literal, but literals are read-only, so this results in undefined behavior when you assign to *it. You need to make a copy of it in the dynamically-allocated memory.
char *word = new char[20];
strcpy(word, "HeLlo");
Also in your loop you should compare *it != '\0'. The end of a string is indicated by the character being the null byte, not the pointer being null.
Given code (as I'm writing this):
char* word;
word = new char[20];
word = "HeLlo";
for (auto it = word; it != NULL; it++){
*it = (char) tolower(*it);
This code has Undefined Behavior in 2 distinct ways, and would have UB also in a third way if only the text data was slightly different:
Buffer overrun.
The continuation condition it != NULL will not be false until the pointer it has wrapped around at the end of the address range, if it does.
Modifying read only memory.
The pointer word is set to point to the first char of a string literal, and then the loop iterates over that string and assigns to each char.
Passing a possibly negative value to tolower.
The char classification functions require a non-negative argument, or else the special value EOF. This works fine with the string "HeLlo" under an assumption of ASCII or unsigned char type. But in general, e.g. with the string "Blåbærsyltetøy", directly passing each char value to tolower will result in negative values being passed; a correct invocation with ch of type char is (char) tolower( (unsigned char)ch ).
Additionally the code has a memory leak, by allocating some memory with new and then just forgetting about it.
A correct way to code the apparent intent:
using Byte = unsigned char;
auto to_lower( char const c )
-> char
{ return Byte( tolower( Byte( c ) ) ); }
// ...
string word = "Hello";
for( char& ch : word ) { ch = to_lower( ch ); }
There are already two nice answers on how to solve your issues using null-terminated C strings and pointers. For the sake of completeness, here is an approach using C++ strings:
string word; // instead of char*
//word = new char[20]; // no longer needed: strings take care of themselves
word = "HeLlo"; // no worry about deallocating previous values: strings take care of themselves
for (auto &it : word) // range-based for, to iterate through all the string elements
it = (char) tolower((unsigned char) it); // cast first: tolower must not be given a negative char value
It's crashing because you are modifying a string literal.
There are dedicated functions for this, although they are non-standard extensions (Visual C++ provides them as _strupr and _strlwr) rather than part of standard C or C++:
use strupr to make a string uppercase and strlwr to make it lowercase. Both modify the array in place.
Here is a usage example:
char upper[] = "make me upper";
printf("%s\n", strupr(upper));
char lower[] = "make me lower";
printf("%s\n", strlwr(lower));

Reverse a string with pointers [duplicate]

This is an amateur question. I searched for other posts about this topic, found lots of results, but am yet to understand the concepts behind the solution.
This is a practice problem in my C++ book. It is not assigned homework. [Instructions here][1] .
WHAT I WOULD LIKE TO DO:
string input;
getline(cin, input); //Get the user's input.
int front = 0;
int rear;
rear = input.size();
WHAT THE PROBLEM WANTS ME TO DO
string input;
getline(cin, input); //Get the user's input.
int* front = 0;
int* rear;
rear = input.size();
Error: a value of type "size_t" cannot be assigned to an entity of type int*
This makes sense to me, as you cannot assign an 'address' of an int to the value of an int.
So my questions are:
What is the correct way to go about this? Should I just forget about initializing front* or rear* to ints? Just avoid that all together? If so, what would be the syntax of that solution?
Why would this problem want me to use pointers like this? It's clear this is a horrible usage of pointers. Without pointers I could complete this problem in like 30 seconds. It's just really frustrating.
I don't really see an advantage to EVER using pointers aside from doing something like returning an array by using pointers.
Thanks guys. I know you like to help users that help themselves so I did some research about this first. I'm just really irritated with the concept of pointers right now vs. just using the actual variable itself.
Posts about this topic that I've previously read:
[Example 1][2]
[Example 2][3]
[Example 3][4]
[1]: http://i.imgur.com/wlufckg.png "Instructions"
[2]: How does reversing a string with pointers works "Post 1"
[3]: Reverse string with pointers? "Post 2"
[4]: Reverse char string with pointers "Post 3"
string.size() does not return a pointer; it returns a size_t.
To reverse a string, try this instead:
string original = "someText"; // The original string
string reversed = original; // This to make sure that the reversed string has same size as the original string
size_t x = original.size(); // Get the size of the original string
for (size_t i = 0; i < x; i++) // Loop to copy from end of original to start of reversed
{
reversed[i]=original[x-1-i];
}
If you really (for some strange reason) need pointers, try this:
string input;
getline(cin, input); //Get the user's input.
char* front = &input[0];
char* rear = &input[input.size()-1];
but I would not use pointers into a string. No need for it.
I guess you may not quite understand the problem here. The problem wants you to COPY a C string and then REVERSE it using pointer operations. There are no classes in standard C, so a C string is quite different from the string class in C++. It is actually an array of char elements that ends with the character '\0'.
Once you understand this, the problem becomes clearer. If you want to copy a C string, you cannot just write str_a = str_b; that only copies the pointer. With the C++ string class, a constructor does the copying for you. In pure C style, however, you first REQUEST memory for the string (you can use malloc here) and then copy each element. For example, to create a function that makes a copy of an input string:
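#include <stdlib.h>
#include <string.h>
char *strcopy(const char* str_in) {
    size_t len = strlen(str_in);
    char *str_out = (char*)malloc(len+1);
    const char *in = str_in;
    char *out = str_out;
    while(*in != '\0') { *out++ = *in++; }
    *out = '\0'; // terminate the copy; the loop stops before copying the '\0'
    return str_out;
}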
As you see, we actually use char* not int* here to operate string element. You should distinguish the pointer (such as in) and the element pointed by the pointer (such as *in) at first.
I'll show you a solution in pure C style for your problem, I hope this would help you to understand it. (You should be able to compile it without modification)
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
char* strreverse(const char* in){
// length of input string
int len = strlen(in);
// allocate memory for string operation
char *out = (char*)malloc(len+1);
// initialize <front> and <end>
char *front = out, *end = out + len - 1;
char buffer;
// copy input string
for(int i = 0; i <= len; i++){ out[i] = in[i]; }
// reverse string
for(; front < end; front++, end--) {
buffer = *front;
*front = *end;
*end = buffer;
}
return out;
}
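int main() {
char *reversed = strreverse("Hello, World!");
printf("REVERSE >> %s\n", reversed);
free(reversed); // strreverse allocates with malloc, so the caller frees
return 0;
}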
This is not what you would do in real C++ code; however, I guess the problem is trying to help you understand the mechanics of pointers, and for that the original C style helps a lot.

how to check const char* values one by one

I have a const char* variable which takes its value from a function that I have written.
When I write this variable to a file, many times it writes nothing, so it must be empty or filled with spaces. The strange thing is that the text file I write to gets a new line every time, whether the variable has a value or not. Why is that? Does it mean that the value returned from the function contains a \n?
How can I check whether the value of a const char* is empty, or, in general, how can I check the value of a char* character by character?
Since C/C++ pointers can be indexed like arrays of the values they point to, there are two ways of checking the values behind a char*: applying the indexing operator or using pointer arithmetic. You can do this:
const char *p = myFunctionReturningConstChar();
for (int i = 0 ; p[i] ; i++) {
if (p[i] == '\n') printf("New line\n");
}
or this:
const char *p = myFunctionReturningConstChar();
while (*p) {
if (*p == '\n') printf("New line\n");
p++;
}
In addition, C++ library provides multiple functions for working with C strings. You may find strlen helpful to check if your pointer points to an empty string.

Array initialization issue

I need an empty char array, but when I try something like this:
char *c;
c = new char [m];
int i;
for (i = 0; i < m; i++)
c[i] = 65 + i;
and then print c, I can see that before the loop c = 0x00384900 "НННННННээээ««««««««юоюою"
After the loop it becomes: 0x00384900 "ABCDEFGээээ««««««««юоюою"
How can I solve this problem? Or maybe there is way with string?
If you're trying to create a string, you need to make sure that the character sequence is terminated with the null character \0.
In other words:
char *c;
c = new char [m+1];
int i;
for (i = 0; i < m; i++)
c[i] = 65 + i;
c[m] = '\0';
Without it, functions on strings like printf won't know where the string ends.
printf("%s\n",c); // should work now
If you allocate a char array on the heap with new char[m], its contents are not initialised.
To get a zero-filled array you have these options:
Allocate the array with static storage duration (a global or a static array). It will be filled with zeroes automatically.
Use ::memset( c, 0, m ); on a heap-allocated or stack array to fill it with zeroes.
Use high-level types like std::string.
I believe that's your debugger trying to interpret the string. When using a char array to represent a string in C or C++, you need to include a null byte at the end of the string. So, if you allocate m + 1 characters for c, and then set c[m] = '\0', your debugger should give you the value you are expecting.
If you want a dynamically-allocated string, then the best option is to use the string class from the standard library:
#include <string>
std::string s;
for (int i = 0; i < m; i++)
s.push_back(static_cast<char>(65 + i));
C strings are null terminated. That means that the last character must be a null character ('\0' or just 0).
The functions that manipulate your string use the characters between the beginning of the array (the pointer you passed as a parameter) and the first null character. If there is no null character in your array, the function will keep reading past the end of the array's memory until it finds one (a buffer over-read, which is undefined behavior). That's why you got some garbage printed in your example.
When you write a string literal in your code, as in printf("Hello");, it is translated into an array of char of length 6 ('H', 'e', 'l', 'l', 'o' and '\0').
Of course, to avoid such complexity you can use std::string.