weird output when printing data of custom string (c++ newbie) - c++

my main concern is if i am doing this safely, efficiently, and for the most part doing it right.
i need a bit of help writing my implementation of a string class. perhaps someone could help me with what i would like to know?
i am attempting to write my own string class for extended functionality and for learning purposes. i will not use this as a substitute for std::string because that could be potentially dangerous. :-P
when i use std::cout to print out the contents of my string, i get some unexpected output, and i think i know why, but i am not really sure. i narrowed it down to my assign function because any other way i store characters in the string works quite fine. here is my assign function:
void String::assign(const String &s)
{
unsigned bytes = s.length() + 1;
// if there is enough unused space for this assignment
if (res_ >= bytes)
{
strncpy(data_, s.c_str(), s.length()); // use that space
res_ -= bytes;
}
else
{
// allocate enough space for this assignment
data_ = new char[bytes];
strcpy(data_, s.c_str()); // copy over
}
len_ = s.length(); // optimize the length
}
i have a constructor that reserves a fixed amount of bytes for the char ptr to allocate and hold. it is declared like so:
explicit String(unsigned /*rbytes*/);
the res_ variable simply records the passed in amount of bytes and stores it. this is the constructor's code within string.cpp:
String::String(unsigned rbytes)
{
data_ = new char[rbytes];
len_ = 0;
res_ = rbytes;
}
i thought using this method would be a bit more efficient rather than allocating new space for the string. so i can just use whatever spaced i reserved initially when i declared a new string. here is how i am testing to see if it works:
#include <iostream>
#include "./string.hpp"
int main(int argc, char **argv)
{
winks::String s2(winks::String::to_string("hello"));
winks::String s(10);
std::cout << s2.c_str() << "\n" << std::endl;
std::cout << s.unused() << std::endl;
std::cout << s.c_str() << std::endl;
std::cout << s.length() << std::endl;
s.assign(winks::String::to_string("hello")); // Assign s to "hello".
std::cout << s.unused() << std::endl;
std::cout << s.c_str() << std::endl;
std::cout << s.length() << std::endl;
std::cout.flush();
std::cin.ignore();
return 0;
}
if you are concerned about winks::String::to_string, i am simply converting a char ptr to my string object like so:
String String::to_string(const char *c_s)
{
String temp = c_s;
return temp;
}
however, the constructor i use in this method is private, so i am forcing to_string upon myself. i have had no problems with this so far. the reason why i made this is to avoid rewriting methods for different parameters ie: char * and String
the code for the private constructor:
String::String(const char *c_s)
{
unsigned t_len = strlen(c_s);
data_ = new char[t_len + 1];
len_ = t_len;
res_ = 0;
strcpy(data_, c_s);
}
all help is greatly appreciated. if i have not supplied a sufficient amount of information, please tell me what you want to know and i will gladly edit my post.
edit: the reason why i am not posting the full string.hpp and string.cpp is because it is rather large and i am not sure if you guys would like that.

You have to make a decision whether you will always store your strings internally terminated with a 0. If you don't store your strings with a terminating zero byte, your c_str function has to add one. Otherwise, it's not returning a C-string.
Your assign function doesn't 0 terminate. So either it's broken, or you didn't intend to 0 terminate. If the former, fix it. If the latter, check your c_str function to make sure it puts a 0 on the end.

Related

When creating a char array, its length is different from required

I need to create a newStr array with length of str array. But after its created the strlen(newStr) is totally different. For example if a strlen(str) is 5, then strlen(newStr) would be 22. What am I doing wrong?
#include <iostream>
using namespace std;
int main()
{
char *str = "Hello";
int strLength = strlen(str);
std::cout << "str = " << str << "\t" << "strLength = " << strLength << std::endl;
char *newStr = new char[strLength];
std::cout << "newStrLength = " << strlen(newStr) << std::endl;
system("pause");
return 0;
}
In the console will be
str = Hello strLength = 5
newStrLength = 22
You are mixing up two different concepts:
new[] allocates uninitialized memory block to your program,
strlen(...) counts characters in a C string before null terminator '\0' is reached.
The size of the allocated block cannot be measured with strlen. In fact, it cannot be measured at all - your program must know how much memory it has requested, and make sure that it does not go past the limit.
Once you allocated new char[n], you can safely copy a C string of length up to n-1 into that block. C++ guarantees that enough memory would be there for you to complete the operation successfully:
char *newStr = new char[strLength+1]; // Note +1 for null terminator
strcpy(newStr, str);
std::cout << "newStrLength = " << strlen(newStr) << std::endl;
delete[] newStr;
The way strlen works is that it examines the contents of the string passed to it, and counts how many characters there are until the first terminating character. The terminating character for a string is '\0' (or 0).
What you've done is asked for the length of a string that you've not assigned any value to; leading to strlen examining random memory; looking for the first 0. In this case, it found it 22 bytes further down; but it could be anything. It could even crash because you start looking into memory you don't have read access to.
The best way to resolve this is to use std::string and then you can call length and other helper functions without having to worry about the underlying pointers too much; which will also resolve your memory leak.

Counting characters in a character pointer

I'm brushing up on some C++, and one of the problems I'm trying to solve is counting the characters behind a character pointer and checking the count against what I expect to see. However, in my solution I noticed a peculiar result. I passed a reference to a char to my function and it returned a count of 2. Why would the test return a count of 2 for a reference to a single character?
I realize the character doesn't have a null terminator and so the code keeps counting but it does eventually return a result, so that means the solution falls short. Any ideas to make it more robust? Here is my solution and result.
CountCharacters.cpp
#include <cstdio>
#include <iostream>
#define ASSERT_EQUALS(paramx1, paramx2) \
{\
int param1 = paramx1;\
int param2 = paramx2;\
if (param1==param2)\
std::cout << "PASS! param1=" << param1 << " param2=" << param2 << std::endl;\
else\
std::cout << "FAIL! param1=" << param1 << " param2=" << param2 << std::endl;\
}
int countCharacters(const char * characters);
int main()
{
char character = '1';
ASSERT_EQUALS(countCharacters("string8\0"), 7);
ASSERT_EQUALS(countCharacters("\0"), 0);
ASSERT_EQUALS(countCharacters(""), 0);
ASSERT_EQUALS(countCharacters(NULL), 0);
ASSERT_EQUALS(countCharacters(&character), 1);
ASSERT_EQUALS(countCharacters('\0'), 0);
return 0;
}
int countCharacters(const char * characters)
{
if (!characters) return 0;
int count = 0;
const char * mySpot = characters;
while (*(mySpot) != '\0')
{
std::cout << "Count=" << count << " mySpot=" << *(mySpot) << std::endl;
count++;
mySpot++;
}
return count;
}
Results:
PASS! param1=7 param2=7
PASS! param1=0 param2=0
PASS! param1=0 param2=0
PASS! param1=0 param2=0
FAIL! param1=2 param2=1
PASS! param1=0 param2=0
You're not passing a reference to a character. You're passing a pointer. Specifically, this is a pointer:
&character
The & and * symbols are a bit confusing when learning C++. Depending on where they're located, they can denote a pointer or a reference:
char character = '1'; // <- char variable
char* characterPtr = &character; // <- pointer to the char variable
char& characterRef = *characterPtr; // <- reference to the char variable
So, you're passing a pointer to a character, and your function is treating it like the head of a string and counting characters until it hits a null character ('\0'). There just happened to be one a couple of bytes away, which is why you're getting the value 2.
EDIT: C/C++ has no native string type, just character types. So you need libraries like the ones you're including to treat character pointers like heads of strings. The convention is that a null character terminates the string. So, you're exercising that convention nicely, but also demonstrating the issue that there's no difference between a pointer to a single character and a pointer to the character at the head of a string, so it's easy to accidentally pass a pointer to a character to a function that's expecting a string. Things get really interesting if the function starts copying characters into that 'string' because it assumes you allocated that memory, but it could be other data that then gets squashed.
Aside from being dangerous, the other major downside of character strings is that they're tedious to manipulate, since there are no native functions for them. So, nice libraries like the STL have been written to solve these problems. They don't require pointers, so they are a lot safer to use (you can use references instead and do bounds checking), and they have a lot of built-in methods, which cuts down on the amount of coding you need to do.

Read into std::string using scanf

As the title said, I'm curious if there is a way to read a C++ string with scanf.
I know that I can read each char and insert it in the deserved string, but I'd want something like:
string a;
scanf("%SOMETHING", &a);
gets() also doesn't work.
Thanks in advance!
this can work
char tmp[101];
scanf("%100s", tmp);
string a = tmp;
There is no situation in which gets() should be used! It is always wrong to use gets(); it was removed in C11 and in C++14.
scanf() doesn't support any C++ classes. However, you can store the result from scanf() into a std::string:
Editor's note: The following code is wrong, as explained in the comments. See the answers by Patato, tom, and Daniel Trugman for correct approaches.
std::string str(100, ' ');
if (1 == scanf("%*s", &str[0], str.size())) {
// ...
}
I'm not entirely sure about the way to specify that buffer length in scanf() and in which order the parameters go (there is a chance that the parameters &str[0] and str.size() need to be reversed and I may be missing a . in the format string). Note that the resulting std::string will contain a terminating null character and it won't have changed its size.
Of course, I would just use if (std::cin >> str) { ... } but that's a different question.
Problem explained:
You CAN populate the underlying buffer of an std::string using scanf, but(!) the managed std::string object will NOT be aware of the change.
const char *line="Daniel 1337"; // The line we're gonna parse
std::string token;
token.reserve(64); // You should always make sure the buffer is big enough
sscanf(line, "%s %*u", token.data());
std::cout << "Managed string: " << token
<< " (size = " << token.size() << ")" << std::endl;
std::cout << "Underlying buffer: " << token.data()
<< " (size = " << strlen(token.data()) << ")" << std::endl;
Outputs:
Managed string: (size = 0)
Underlying buffer: Daniel (size = 6)
So, what happened here?
The object std::string is not aware of changes not performed through the exported, official, API.
When we write to the object through the underlying buffer, the data changes, but the string object is not aware of that.
If we were to replace the original call token.reserve(64) with token.resize(64), a call that changes the size of the managed string, the results would have been different:
const char *line="Daniel 1337"; // The line we're gonna parse
std::string token;
token.resize(64); // You should always make sure the buffer is big enough
sscanf(line, "%s %*u", token.data());
std::cout << "Managed string: " << token
<< " (size = " << token.size() << ")" << std::endl;
std::cout << "Underlying buffer: " << token.data()
<< " (size = " << strlen(token.data()) << ")" << std::endl;
Outputs:
Managed string: Daniel (size = 64)
Underlying buffer: Daniel (size = 6)
Once again, the result is sub-optimal. The output is correct, but the size isn't.
Solution:
If you really want to do this, follow these steps:
Call resize to make sure your buffer is big enough. Use a #define for the maximal length (see step 2 to understand why):
std::string buffer;
buffer.resize(MAX_TOKEN_LENGTH);
Use scanf while limiting the size of the scanned string using "width modifiers" and check the return value (return value is the number of tokens scanned):
#define XSTR(x) STR(x)
#define STR(x) #x
...
int rv = scanf("%" XSTR(MAX_TOKEN_LENGTH) "s", &buffer[0]);
Reset the managed string size to the actual size in a safe manner:
buffer.resize(strnlen(buffer.data(), MAX_TOKEN_LENGTH));
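The snippet below works, as long as scanf writes through a non-const pointer (&s[0], not c_str()) and the width leaves room for the terminator:
string s(100, '\0');
scanf("%99s", &s[0]);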
Here is a version with no length limit (for when the length of the input is unknown).
std::string read_string() {
std::string s; unsigned int uc; int c;
// The ASCII code of space is 32, and all codes less than or equal to 32 are non-printing.
// EOF is negative, so it becomes larger than 32 after the unsigned conversion.
while ((uc = (unsigned int)getchar()) <= 32u);
if (uc < 256u) s.push_back((char)uc);
while ((c = getchar()) > 32) s.push_back((char)c);
return s;
}
For performance consideration, getchar is definitely faster than scanf, and std::string::reserve could pre-allocate buffers to prevent frequent reallocation.
You can construct an std::string of an appropriate size and read into its underlying character storage:
std::string str(100, ' ');
scanf("%100s", &str[0]);
str.resize(strlen(str.c_str()));
The call to str.resize() is critical, otherwise the length of the std::string object will not be updated. Thanks to Daniel Trugman for pointing this out.
(There is no off-by-one error with the size reserved for the string versus the width passed to scanf, because since C++11 it is guaranteed that the character data of std::string is followed by a null terminator so there is room for size+1 characters.)
int n = 15; // you are going to scan no more than n characters
std::string str(n + 1, '\0'); // room for n characters plus the terminator
scanf("%15s", &str[0]); // scanf only changes the content of the string, as if it were an array
str = str.c_str(); // trim the string to what was read; you'll have lots of problems without this line

How can a conversion from string to char[] make it longer?

I have a problem I cannot really understand how it could exist.
I have a bunch of files ordered by time and containing a bunch of objects. The result should be one file per time ordered in a directory per object.
It works quite fine but at the point where I convert the Outputstring to a char[] to use fstream.open(), the array has 3 characters more than the string has.
#include <iostream>
#include <stdio.h>
#include <string.h>
using namespace std;
int main()
{
string strOutput;
char *OutputFile;
short z;
strOutput = "/home/.../2046001_2013-02-25T0959.txt";
cout << strOutput << endl;
OutputFile = new char[strOutput.length()];
z = 0;
while (z < strOutput.length())
{
OutputFile[z] = strOutput[z];
z++;
}
cout << OutputFile << endl;
return 0;
}
The first output is always correct but the second sometimes has the end .txt60A, .txt5.a or .txt9.A.
When it occurs its always the same object and time and it happens every try. But not every object does that.
For obvious reasons I cannot reproduce this error in this minimal code snippet, but I also don't want to post the whole 390 lines of code.
Do you have any suggestions?
You are missing terminating null at the end of C string. To fix:
OutputFile = new char[strOutput.length() + 1]; // notice +1
z = 0;
while (z < strOutput.length())
{
OutputFile[z] = strOutput[z];
z++;
}
OutputFile[z] = 0; // add terminating 0 byte
Of course there are better ways to do the whole thing... you don't really need to copy at all, just get rid of OutputFile and the whole loop, and use the char array inside std::string:
cout << strOutput.c_str() << endl;
I assume the real code wants a C string. std::cout can print std::string directly, of course:
cout << strOutput << endl;
If you actually want to create a copy, it's best to just copy std::string and store that, and use c_str-method to get the C buffer when you need it:
string OutputFile = strOutput;
If you know you really do need a raw char array allocated from heap, you should use std::unique_ptr (or possibly some other C++ smart pointer class) to wrap the pointer, so you do not need to delete manually and avoid memory leaks, and also use standard library function to do copying:
#include <memory>
#include <cstring>
...
unique_ptr<char[]> OutputFile(new char[strOutput.length() + 1]);
::strcpy(OutputFile.get(), strOutput.c_str()); // :: means top level namespace
Char arrays need an extra null character or \0 appended to the end, otherwise the code reading the string will run past the end of the array until it finds one.
OutputFile = new char[strOutput.length() + 1];
z = 0;
while (z < strOutput.length())
{
OutputFile[z] = strOutput[z];
z++;
}
OutputFile[z] = '\0';
It may appear to work if the next byte after the array happens to be a null, but that's just a coincidence. I'm sure that's why your code works on the first pass.
at the point where I convert the Outputstring to a char[] to use fstream.open()
You don't have to do that. Do something like this instead:
outfile.open(Outputstring.c_str(), std::fstream::out);
Of course, if you have a C++11-compliant compiler, you can just do:
outfile.open(Outputstring, std::fstream::out);

how to print char array in c++

How can I print a char array that I initialize and then concatenate to another char array? Please see the code below.
int main () {
char dest[1020];
char source[7]="baby";
cout <<"source: " <<source <<endl;
cout <<"return value: "<<strcat(dest, source) <<endl;
cout << "pointer pass: "<<dest <<endl;
return 0;
}
this is the output
source: baby
return value: v����baby
pointer pass: v����baby
basically i would like to see the output print
source: baby
return value: baby
pointer pass: baby
You haven't initialized dest
char dest[1020] = ""; //should fix it
You were just lucky that the 6th (random) value in dest happened to be 0. If the first 0 had been the 1000th character, your return value would be much longer. If there had been none within the 1020 bytes of the array, you'd get undefined behavior.
Strings as char arrays must be delimited with 0. Otherwise there's no telling where they end. You could alternatively say that the string ends at its zeroth character by explicitly setting it to 0:
char dest[1020];
dest[0] = 0;
Or you could initialize your whole array with 0's:
char dest[1020] = {};
And since your question is tagged C++ I cannot but note that in C++ we use std::strings which save you from a lot of headache. Operator + can be used to concatenate two std::strings
Don't use char[]. If you write:
std::string dest;
std::string source( "baby" );
// ...
dest += source;
you'll have no problems. (In fact, your problem is due to the fact that strcat requires a '\0' terminated string as its first argument, and you're giving it random data. Which is undefined behavior.)
Your dest array isn't initialized, so strcat tries to append source to the end of dest, which is determined by a trailing '\0' character, but it's undefined where an uninitialized array might end (if it does at all).
So you end up printing more or less random characters until a '\0' character accidentally occurs.
Try this
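#include <cstdio>   // getchar
#include <cstring>  // memset, strcat_s (Microsoft CRT)
#include <iostream>
using namespace std;
int main()
{
char dest[1020];
memset(dest, 0, sizeof(dest)); // zero the buffer so it starts as an empty string
char source[7] = "baby";
cout << "Source: " << source << endl;
// strcat_s returns an error code (0 on success), not the destination pointer
cout << "return value: " << strcat_s(dest, source) << endl;
cout << "pointer pass: " << dest << endl;
getchar();
return 0;
}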
Tested with VS 2010 Express.
Clear the memory with memset as soon as you declare dest; it's more secure. Also, if you are using VC++, use strcat_s() instead of strcat().