I've been reading a book for self-study (http://www.amazon.com/gp/product/0321992784) and I'm on chapter 17 doing the exercises. I solved one of them, but I'm not satisfied with my solution and would like some help. Thank you in advance.
The Exercise: Write a program that reads characters from cin into an array that you allocate on the free store. Read individual characters until an exclamation mark (!) is entered. Do not use std::string. Do not worry about memory exhaustion.
What I did:
char* append(const char* str, char ch); // Add a character to the string and return a duplicate
char* loadCstr(); // Read characters from cin into an array of characters
int main()
{
char* str{ loadCstr() };
std::cout << str << '\n';
return 0;
}
I made 2 functions, 1 to create a new string with a size 1 larger than the old and add a character at the end.
char* append(const char* str, char ch)
/*
Create a new string with a size 1 greater than the old
insert old string into new
add character into new string
*/
{
char* newstr{ nullptr };
int i{ 0 };
if (str)
newstr = new char [ sizeof(str) + 2 ];
else
newstr = new char [ 2 ];
if(str)
while (str [ i ] != '\0')
newstr [ i ] = str [ i++ ]; // Put character into new string, then increment the index
newstr [ i++ ] = ch; // Add character and increment the index
newstr [ i ] = '\0'; // Trailing 0
return newstr;
}
This is the function for the exercise, using the append function I created. It works, but from what I understand there is a memory leak each time I call append, because I create a new character array and don't delete the old one.
char* loadCstr()
/*
get a character from cin, append it to str until !
*/
{
char* str{ nullptr };
for (char ch; std::cin >> ch && ch != '!';)
str = append(str, ch);
return str;
}
I tried adding another pointer to hold the old array and deleting it after making a new one, but after about 6 calls in this loop I get a runtime error that I think tells me I'm deleting something I shouldn't, which is where I got confused.
This is the old one that doesn't work beyond 6 characters:
char* loadCstr()
/*
get a character from cin, append it to str until !
*/
{
char* str{ nullptr };
for (char ch; std::cin >> ch && ch != '!';) {
char* temp{ append(str, ch) };
if (str)
delete str;
str = temp;
}
return str;
}
So I want to know how I can fix this function so there are no memory leaks. Thank you again. (Also please note, I do know these functions already exist and using std::string handles all the free store stuff for me, I just want to understand it and this is a learning exercise.)
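You have to use the standard C function std::strlen instead of the sizeof operator, because in your function sizeof(str) returns the size of the pointer (typically 4 or 8 bytes), not the length of the string. The new array is therefore too small once the string grows past a few characters, which is consistent with the crash you see.
Also, you need to delete the previously allocated array, and because it was allocated with new[] it must be released with delete [], not delete.
The function could look like this (it needs the <cstring> header for std::strlen):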
char* append(const char* str, char ch)
/*
Create a new string with a size 1 greater than the old
insert old string into new
add character into new string
*/
{
size_t n = 0;
if ( str ) n = std::strlen( str );
char *newstr = new char[ n + 2 ];
for ( size_t i = 0; i < n; i++ ) newstr[i] = str[i];
delete [] str;
newstr[n] = ch;
newstr[n+1] = '\0';
return newstr;
}
And in the function loadCstr it can be called like
str = append( str, ch );
Also, instead of the loop that copies the string, you could use the standard algorithm std::copy.
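As a rough sketch of the std::copy suggestion, the hand-written copy loop can be replaced with a single call from <algorithm>. The name append_with_copy is only illustrative, and the sketch assumes the incoming pointer is either null or was obtained from new char[].

```cpp
#include <algorithm> // std::copy
#include <cstddef>   // std::size_t
#include <cstring>   // std::strlen

// Hypothetical variant of append(): returns a new array holding str + ch
// and frees the old array, which must come from new char[] (or be null).
char* append_with_copy(char* str, char ch)
{
    std::size_t n = str ? std::strlen(str) : 0;
    char* newstr = new char[n + 2];            // old text + ch + '\0'
    if (str) std::copy(str, str + n, newstr);  // replaces the hand-written loop
    delete[] str;                              // delete[] on a null pointer is a no-op
    newstr[n] = ch;
    newstr[n + 1] = '\0';
    return newstr;
}
```

It is called the same way as before, str = append_with_copy(str, ch);, and the final array still has to be released with delete[] by the caller.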
Is the point to learn about memory management, or about how string operations work internally?
For the second (learning about string operations), you should use std::unique_ptr<char[]> which will automatically free the attached array when the pointer dies. You'll still need to calculate string length, copy between strings, append -- all the things you are doing now. But std::unique_ptr<char[]> will handle the deallocation.
For the first case, you're better off writing an RAII class (a custom version of std::unique_ptr<T>) and learning how to free memory in a destructor than scattering delete [] statements all over your code. Writing delete [] everywhere is actually a bad habit; learning it will move your ability to program C++ backwards.
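A minimal sketch of the std::unique_ptr<char[]> approach might look like the following. The helper name append_owned is an assumption made for illustration; the length calculation and copying are still done by hand, and only the deallocation is delegated to the smart pointer.

```cpp
#include <cstddef> // std::size_t
#include <cstring> // std::strlen, std::memcpy
#include <memory>  // std::unique_ptr

// Hypothetical helper: builds str + ch in a new array owned by a unique_ptr.
// The previous array is released automatically when the caller's
// unique_ptr is overwritten with the result.
std::unique_ptr<char[]> append_owned(const std::unique_ptr<char[]>& str, char ch)
{
    std::size_t n = str ? std::strlen(str.get()) : 0;
    std::unique_ptr<char[]> result(new char[n + 2]); // old text + ch + '\0'
    if (str) std::memcpy(result.get(), str.get(), n);
    result[n] = ch;
    result[n + 1] = '\0';
    return result;
}
```

Inside the reading loop this would be used as str = append_owned(str, ch); with str declared as std::unique_ptr<char[]>, and no delete [] appears anywhere in the calling code.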
I'm doing an exercise in which I have to copy a C-style string into memory allocated on the free store. I am required to do it without subscripting, relying solely on pointer arithmetic. I wrote the following function:
char* str_dup(const char* s)
{
// count no. of elements
int i = 0;
const char* q = s;
while (*q) { ++i; ++q; }
//create an array +1 for terminating 0
char* scpy = new char[i + 1];
//copy elements to new array
while (*s)
{
*scpy = *s;
++s;
++scpy;
}
*scpy = 0;
return scpy;
}
The function returns random characters. But if I change it into this:
char* str_dup(const char* s)
{
// count no. of elements
int i = 0;
const char* q = s;
while (*q) { ++i; ++q; }
//create an array +1 for terminating 0
char* scpyx = new char[i + 1];
char* scpy = scpyx;
//copy elements to new array
while (*s)
{
*scpy = *s;
++s;
++scpy;
}
*scpy = 0;
return scpyx;
}
it works. Can someone explain to me why the first code is not working and the second is?
The first code is not working since you return the final value of scpy, which at that point points at the terminating NUL character, and not the start of the string.
One solution is to do as you did, and save a copy of the original pointer to have something to return.
You should really use strlen() and memcpy(), they make this easier but perhaps they're off-limits to you.
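If the library routines are allowed, a version built on strlen() and memcpy() could be sketched as below. It keeps the name str_dup from the question; the caller is assumed to own the result and release it with delete[].

```cpp
#include <cstddef> // std::size_t
#include <cstring> // std::strlen, std::memcpy

// Sketch of str_dup built on the library routines; the caller owns the
// result and must release it with delete[].
char* str_dup(const char* s)
{
    std::size_t n = std::strlen(s) + 1; // include the terminating '\0'
    char* copy = new char[n];
    std::memcpy(copy, s, n);            // copies the terminator as well
    return copy;                        // still points at the first character
}
```

Because the destination pointer is never advanced, there is no second pointer to keep track of and the function cannot return the end of the string by mistake.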
I have built my own versions of strlen and strdup.
The first time I use my strdup it's okay. I close the window and run it again, and then at the end of the program, after the return 0 from main, the program crashes. VS just says that it triggered a breakpoint.
#include "stdafx.h"
#include <iostream>
using namespace std;
int MyStrlen(const char* str);
char* MyStrdup(const char* str);
int main()
{
char *s1 = "Hello World!";
char *s2 = MyStrdup(s1);
cout << s1 << " , " << s2 << endl;
system("pause");
return 0;
}
int MyStrlen(const char* str)
{
register int iLength = 0;
while (str[iLength] != NULL)
{
iLength++;
}
return iLength;
}
char* MyStrdup(const char* str)
{
char* newStr;
int strLength = MyStrlen(str);
newStr = new char(strLength+1);
for (register int i = 0; i < strLength; i++)
{
newStr[i] = str[i];
}
newStr[strLength] = NULL;
return newStr;
}
Can someone note the place that makes it crash? I think it's a memory leak maybe.
Also, can you note things to improve in the code? For my learning purpose
EDIT: Thanks, I don't know why I used () instead of [] to define my new char[]. That was a memory leak or overwrite after all.
The "new" statement for an array should be with square brackets:
newStr = new char[strLength+1];
When you do
new char(c)
It allocates a single character and copies the character c into it.
When you do
new char[n]
it allocates memory for n characters
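The difference between the two forms can be seen in a small sketch. The function names make_single and make_array are made up for this illustration.

```cpp
#include <cstddef> // std::size_t

// Parentheses hold an initializer: this allocates ONE char set to 'A'.
// Release it with delete.
char* make_single() { return new char('A'); }

// Brackets hold an element count: this allocates n chars. The trailing ()
// value-initializes them to '\0'. Release the array with delete[].
char* make_array(std::size_t n) { return new char[n](); }
```

Writing past the first element of the pointer returned by make_single() is undefined behavior, which is what happens when new char(strLength+1) is used as if it were an array.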
The expression new char(strLength+1) allocates a single character, and initializes it to strLength + 1. That of course means you will write out of bounds and have undefined behavior when you copy the string.
You should use new char[strLength + 1] instead, to allocate an "array" of characters.
On an unrelated note, while the terminating character in a string is commonly called the null character, it's not actually a null pointer (which is what NULL is for). The comparison usually compiles because NULL is commonly defined as 0 in C++, but an implementation may define it differently (for example as nullptr since C++11), so you should be explicit and use '\0' instead (it also gives more context for future readers).
I've tried so many ways from the Internet to append a character to a char*, but none of them seems to work. Here is one of my incomplete solutions:
char* appendCharToCharArray(char * array, char a)
{
char* ret = "";
if (array!="")
{
char * ret = new char[strlen(array) + 1 + 1]; // + 1 char + 1 for null;
strcpy(ret,array);
}
else
{
ret = new char[2];
strcpy(ret,array);
}
ret[strlen(array)] = a; // (1)
ret[strlen(array)+1] = '\0';
return ret;
}
This only works when the passed array is "" (empty). Otherwise it doesn't help (and I get an error at (1)). Could you please help me with this? Thanks so much in advance!
Remove the char * ret declaration inside the if block, which hides the outer ret. Because of it you have a memory leak and, on the other hand, no memory allocated for the outer ret.
To compare C-style strings you should use strcmp(array,"") rather than array!="". Your final code should look like this:
char* appendCharToCharArray(char* array, char a)
{
size_t len = strlen(array);
char* ret = new char[len+2];
strcpy(ret, array);
ret[len] = a;
ret[len+1] = '\0';
return ret;
}
Note that you must release the memory of the returned ret somewhere by calling delete[] on it.
Why don't you use std::string? It has an append method that can add a character at the end of a string:
std::string str;
str.append(1, 'x'); // append one copy of 'x'
// or
str += 'x';
The function name does not reflect the semantics of the function. In fact, you do not append a character; you create a new character array that contains the original array plus the given character. So if you indeed need a function that appends a character to a character array, I would write it the following way
bool AppendCharToCharArray( char *array, size_t n, char c )
{
size_t sz = std::strlen( array );
if ( sz + 1 < n )
{
array[sz] = c;
array[sz + 1] = '\0';
}
return ( sz + 1 < n );
}
If you need a function that will contain a copy of the original array plus the given character then it could look the following way
char * CharArrayPlusChar( const char *array, char c )
{
size_t sz = std::strlen( array );
char *s = new char[sz + 2];
std::strcpy( s, array );
s[sz] = c;
s[sz + 1] = '\0';
return ( s );
}
The specific problem is that you're declaring a new variable instead of assigning to an existing one:
char * ret = new char[strlen(array) + 1 + 1];
^^^^^^ Remove this
and trying to compare string values by comparing pointers:
if (array!="") // Wrong - compares pointer with address of string literal
if (array[0] == 0) // Better - checks for empty string
although there's no need to make that comparison at all; the first branch will do the right thing whether or not the string is empty.
The more general problem is that you're messing around with nasty, error-prone C-style string manipulation in C++. Use std::string and it will manage all the memory allocation for you:
std::string appendCharToString(std::string const & s, char a) {
return s + a;
}
char ch = 't';
char chArray[2];
sprintf(chArray, "%c", ch);
char chOutput[10]="tes";
strcat(chOutput, chArray);
cout<<chOutput;
OUTPUT:
test
Let's say I have a constant c-style string say
const char* msg = "fred,jim,345,7665";
I'd like to tokenize this and read out the individual fields but for performance reasons I don't want to make a copy. How can I do this?
Obviously strtok takes a non-constant pointer, and boost::tokenizer is an option, but I am unsure what it is doing behind the scenes.
Inevitably you will require some copy of the string, even if it is a substring being copied.
If you have a strtok_r function, you can use that, but it will still require a mutable string to do its work. Beware, however, as not all systems provide the function (e.g. Windows), which is why I've provided an implementation here. It works by requiring an additional parameter: a pointer to a C string to save the address of the next match. This allows for it to be more reentrant (thread-safe) in theory. However, you'll still be mutating the value. You could modify it to suit your needs if you like, perhaps copying N bytes into a destination buffer and null-terminating that buffer to avoid the need to modify the source string.
/*
Usage:
char *tok;
char *savep;
tok = mystrtok_r (somestr, ",", &savep);
while (NULL != tok)
{
// Do something with `tok'.
tok = mystrtok_r (NULL, ",", &savep);
}
*/
char *
mystrtok_r (char *str, const char *delims, char **nextp)
{
if (str == NULL)
str = *nextp;
str += strspn (str, delims); /* skip leading delimiters */
if (*str == 0)
{
*nextp = str; /* no tokens left */
return NULL;
}
*nextp = str + strcspn (str, delims);
if (**nextp != 0)
{
**nextp = 0; /* terminate the token */
++*nextp; /* resume after the delimiter */
}
return str;
}
It depends on how you're going to use it.
If you want to get the next token, and then the next (like an iteration over the string), then you only really need to copy the current token into memory.
long strtok2( char *strDest, const char *strSrc, const char cTok, long lOffset, long lMax)
{
if(lMax > 0)
{
strSrc += lOffset;
char * start = strDest;
// copy until the buffer is full, the string ends, or the delimiter is found
while(--lMax && *strSrc && *strSrc != cTok) *strDest++ = *strSrc++;
*strDest = 0; // terminate the copied token
return strDest - start; // the length of the token
}
return 0;
}
I adapted a simple strcpy from http://vijayinterviewquestions.blogspot.com.au/2007/07/implement-strcpy-function.html
const char* msg = "fred,jim,345,7665";
char buffer[20];
long offset = 0;
long msgLength = strlen(msg);
while(offset < msgLength)
{
long length = strtok2(buffer, msg, ',', offset, 20);
cout << buffer << '\n';
offset += (length+1); // step over the token and its delimiter
}
Well, without a little more detail it's hard to know exactly what you want. I'll guess you are parsing delimited items where consecutive delimiters should be treated as zero length tokens (which is usually correct for comma separated elements). I'm also assuming a blank line counts as a single zero length token. This is how I'd approach it:
const char *token_begin = msg;
int length;
for(;;)
{
length = 0;
while(!isDelimiter(token_begin[length])) //< must include \0 as delimiter
++length;
//..do something here with token. token is at: token_begin[0..length)
if ( token_begin[length] != 0 )
token_begin = &token_begin[length+1]; //skip beyond non-null delimiter
else
break; //token null terminated. exit
}
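The same begin-plus-length idea can be sketched with std::string_view, assuming a C++17 compiler is available (the original question does not say so). The helper name split_views is hypothetical; no characters are copied, and each view only refers back into the original buffer.

```cpp
#include <cstddef>     // std::size_t
#include <string_view> // std::string_view (C++17)
#include <vector>

// Hypothetical helper: splits text on delim without copying characters.
// Each std::string_view is a pointer plus a length into the original
// buffer, so the views are only valid while that buffer is alive.
// Consecutive delimiters yield empty tokens, as in the loop above.
std::vector<std::string_view> split_views(std::string_view text, char delim)
{
    std::vector<std::string_view> tokens;
    std::size_t begin = 0;
    for (;;)
    {
        std::size_t end = text.find(delim, begin);
        if (end == std::string_view::npos)
        {
            tokens.push_back(text.substr(begin)); // last token
            break;
        }
        tokens.push_back(text.substr(begin, end - begin));
        begin = end + 1; // skip the delimiter
    }
    return tokens;
}
```

Note that the views are not null-terminated, so they cannot be handed to functions that expect a C string without making a copy first.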
If you are going to store the tokens somewhere then a copy will be necessary in any case, and strtok does this nicely by reusing the string and placing null terminators inside it.
The only other option I see to avoid copying is a lexer that scans the string and produces tokens through a state machine, storing the partial results in a buffer. But every token would still have to be stored in a null-terminated string at some point, so you are not really saving anything.
Here is my proposal. The code is procedural and uses global variables (position, messageLength and token). I know global variables are bad practice, but this is only to give you the idea; you can replace them with data members if you need OOP.
int position, messageLength;
char token[MAX]; // MAX = Value greater than the maximum length
// of the tokens(e.g. 1,000);
bool hasNext()
{
return position < messageLength;
}
char* next(const char* message)
{
int i = 0;
while (position < messageLength && message[position] != ',') {
token[i++] = message[position];
position++;
}
position++; // ',' found
token[i] = '\0';
return token;
}
int main(int argc, char **argv)
{
const char* msg = "fred,jim,345,7665";
position = 0;
messageLength = strlen(msg);
while (hasNext())
cout << next(msg) << endl;
return EXIT_SUCCESS;
}
I've got a function that splits up a string into various sections and then parses them, but when converting a string to char* I get a malformed output.
int parseJob(char * buffer)
{ // Parse raw data, should return individual jobs
const char* p;
int rows = 0;
for (p = strtok( buffer, "~" ); p; p = strtok( NULL, "~" )) {
string jobR(p);
char* job = &jobR[0];
parseJobParameters(job); // At this point, the data is still in good condition
}
return (1);
}
int parseJobParameters(char * buffer)
{ // Parse raw data, should return individual job parameters
const char* p;
int rows = 0;
for (p = strtok( buffer, "|" ); p; p = strtok( NULL, "|" )) { cout<<p; } // At this point, the data is malformed.
return (1);
}
I don't know what happens between the first function calling the second one, but it malforms the data.
As you can see from the code example given, the same method to convert string to char* is used and it works fine.
I'm using Visual Studio 2012/C++, any guidance and code examples will be greatly appreciated.
The "physical" reason your code does not work has nothing to do with std::string or C++. It wouldn't work in pure C either. strtok is a function that stores its intermediate parsing state in some global (static) variable. This immediately means that you cannot use strtok to parse more than one string at a time. Starting a second parse session before finishing the first overwrites the internal state stored by the first session, ruining it beyond repair. In other words, strtok parse sessions must not overlap. In your code they do overlap.
Also, in C++03 the idea of using std::string with strtok directly is doomed from the start. The internal sequence stored in std::string is not guaranteed to be null-terminated. This means that generally &jobR[0] is not a C-string. It can't be used with strtok. To convert a std::string to a C-string you have to use c_str(). But C-string returned by c_str() is non-modifiable.
In C++11 and later the situation is different: the characters are stored contiguously, and c_str() and data() must return a pointer p such that p + i == &operator[](i) for every i in [0, size()], so &jobR[0] does point to a null-terminated sequence for a non-empty string. The pointer returned by c_str() is still non-modifiable, and a non-const data() was only added in C++17. None of this removes the overlap problem described above.
You cannot use strtok() to parse multiple strings at the same time, like you are doing. The first call to parseJobParameters() in the first loop iteration of parseJob() will alter the internal buffer that strtok() points to, thus the second loop iteration of parseJob() will not be processing the original data anymore. You need to rewrite your code to not use nested calls to strtok() anymore, eg:
#include <cstring>
#include <iostream>
#include <string>
#include <vector>
void split(const std::string &s, const char *delims, std::vector<std::string> &vec)
{
// strtok() modifies its input, so work on a writable, null-terminated copy
// alternatively, use s.find_first_of() and s.substr() instead...
std::vector<char> buf(s.begin(), s.end());
buf.push_back('\0');
for (const char* p = std::strtok(&buf[0], delims); p != NULL; p = std::strtok(NULL, delims))
{
vec.push_back(p);
}
}
int parseJobParameters(const char * buffer); // defined below
int parseJob(const char * buffer)
{
std::vector<std::string> jobs;
split(buffer, "~", jobs); // this strtok() session finishes before the next one starts
for (std::vector<std::string>::iterator i = jobs.begin(); i != jobs.end(); ++i)
{
parseJobParameters(i->c_str());
}
return (1);
}
int parseJobParameters(const char * buffer)
{
std::vector<std::string> params;
split(buffer, "|", params);
for (std::vector<std::string>::iterator i = params.begin(); i != params.end(); ++i)
{
std::cout << *i;
}
return (1);
}
Whilst char* job = &jobR[0]; gives you the address of the first character in the string, it is not guaranteed (before C++11) to give you a valid C-style string. jobR.c_str() does, but it returns a const char*, so char* job = jobR.c_str(); will not compile, and strtok() needs a buffer it can modify; copy the characters into a writable, null-terminated buffer instead.
That may help, but there could of course also be something wrong with the way you read the buffer that is passed to parseJob.
Edit: of course, you are also calling strtok from a function that uses strtok. Inside strtok looks a bit like this:
char *strtok(char *str, char *separators)
{
static char *last;
char *found = NULL;
if (!str) str = last;
... do searching for needle, set found to beginning of non-separators ...
if (found)
{
*str = 0; // mark end of string.
}
last = str;
return found;
}
Since "last" gets overwritten when you call parseJobParameters, you can't use strtok(NULL, ... ) when you get back to parseJob.
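One way around the shared static pointer is to let the caller hold the parse position, which is the idea behind POSIX strtok_r. The sketch below uses a made-up tokenize_r with that three-argument shape and a made-up parse_nested helper to show nested parsing of the '~' and '|' levels; it modifies the buffer it is given, just as strtok does.

```cpp
#include <cstring> // std::strspn, std::strcspn
#include <string>

// Hypothetical reentrant tokenizer: the parse position is kept in *save,
// which the caller owns, instead of in a hidden static variable.
char* tokenize_r(char* str, const char* delims, char** save)
{
    if (!str) str = *save;              // continue an earlier parse
    str += std::strspn(str, delims);    // skip leading delimiters
    if (*str == '\0') { *save = str; return nullptr; }
    char* token = str;
    str += std::strcspn(str, delims);   // find the end of the token
    if (*str != '\0') *str++ = '\0';    // terminate it and step past the delimiter
    *save = str;
    return token;
}

// Splits jobs on '~' and parameters on '|' in nested loops, returning the
// parameters joined with spaces to show that the outer parse survives.
std::string parse_nested(char* text)
{
    std::string out;
    char* outer = nullptr;
    for (char* job = tokenize_r(text, "~", &outer); job;
         job = tokenize_r(nullptr, "~", &outer))
    {
        char* inner = nullptr;          // separate state for the inner parse
        for (char* p = tokenize_r(job, "|", &inner); p;
             p = tokenize_r(nullptr, "|", &inner))
        {
            if (!out.empty()) out += ' ';
            out += p;
        }
    }
    return out;
}
```

Because each loop has its own save pointer, the inner parse no longer disturbs the outer one.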