How can one implement a custom string class using the STL? - c++

In C++ - The Complete Reference, the author gives us a challenge after showing how he implements a custom C++ string class. Excerpt from the book:
A Challenge:
Try implementing StrType (the string class) using the STL. That is, use a container to store the characters that comprise a string. Use iterators to operate on the strings, and use the algorithms to perform the various string manipulations.
I understand the basic concept here, but am having trouble implementing it. Should I use std::vector<char> and push_back for every char, or something like that? What about the string manipulations? Sample code would be gratefully accepted, or you can explain how I might implement this.

Yes, std::vector<char> sounds like a great idea. It will save you from the troubles of writing a custom destructor, copy constructor and copy assignment operator. Plus all the iterator member functions (begin, end and co.) can just delegate to the std::vector<char> versions.
Can you give some code showing how to do string manipulations, e.g. concatenation?
Sure thing, here is how I would overload operator+= and operator+ for the string type:
class StrType
{
    std::vector<char> vec;
public:
    // ...
    StrType& operator+=(const StrType& rhs)
    {
        vec.insert(vec.end(), rhs.vec.begin(), rhs.vec.end());
        return *this;
    }
};
StrType operator+(StrType lhs, const StrType& rhs)
{
    lhs += rhs;
    return lhs;
}
There's probably a more efficient version of operator+, but you can figure that out on your own.
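To show how the pieces above might fit together, here is a minimal sketch of a fuller StrType. The constructor from a C string, the const_iterator typedef, size() and the comparison operators are my own additions for illustration, not something from the book; everything simply delegates to std::vector<char>.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical sketch of StrType built on std::vector<char>.
class StrType
{
    std::vector<char> vec;

public:
    typedef std::vector<char>::const_iterator const_iterator;

    StrType() {}

    // Copy the characters of a C string (without the terminating '\0').
    StrType(const char* s) : vec(s, s + std::strlen(s)) {}

    // Iterator access simply delegates to the vector.
    const_iterator begin() const { return vec.begin(); }
    const_iterator end() const { return vec.end(); }
    std::size_t size() const { return vec.size(); }

    StrType& operator+=(const StrType& rhs)
    {
        vec.insert(vec.end(), rhs.vec.begin(), rhs.vec.end());
        return *this;
    }

    // Equality and ordering reuse the vector's own (lexicographical) comparisons.
    friend bool operator==(const StrType& a, const StrType& b) { return a.vec == b.vec; }
    friend bool operator<(const StrType& a, const StrType& b) { return a.vec < b.vec; }
};

inline StrType operator+(StrType lhs, const StrType& rhs)
{
    lhs += rhs;
    return lhs;
}
```

With this, StrType("AB") + StrType("XY") compares equal to StrType("ABXY"), and no destructor or copy operations had to be written by hand.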

Using std::vector<char> would probably be the best container to use in this case (random access iterators and low overhead make it an attractive choice for a string).
Further to your comment on FredOverflow's answer, you can perform a string concatenation as follows:
std::vector<char> firstString;
firstString.push_back('A');
firstString.push_back('B');
std::vector<char> secondString;
secondString.push_back('X');
secondString.push_back('Y');
firstString.insert( firstString.end(), secondString.begin(), secondString.end() );
for( auto it = firstString.begin(); it != firstString.end(); ++it )
{
    std::cout << (*it);
}
In this case this would print out: ABXY. You can see it here: http://ideone.com/OmdoU
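The challenge also asks for the standard algorithms to do the string manipulations. As a rough sketch of what that could look like on a std::vector<char>, here are two helpers; the names CharVec, to_upper and find_substring are made up for this example.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <vector>

typedef std::vector<char> CharVec;

// Upper-case every character in place with std::transform.
void to_upper(CharVec& s)
{
    std::transform(s.begin(), s.end(), s.begin(), [](char c) {
        return static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    });
}

// Return the index of the first occurrence of needle in haystack,
// or haystack.size() if there is none. std::search does the matching.
std::size_t find_substring(const CharVec& haystack, const CharVec& needle)
{
    CharVec::const_iterator it =
        std::search(haystack.begin(), haystack.end(), needle.begin(), needle.end());
    return static_cast<std::size_t>(it - haystack.begin());
}
```

Other operations follow the same pattern, for example std::reverse to reverse a string or std::find to locate a single character.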

Related

Effective construction std::string from std::unordered_set<char>

I have an unordered_set of chars
std::unordered_set<char> u_setAlphabet;
I want to get the contents of the set as a std::string. My implementation currently looks like this:
std::string getAlphabet() {
std::string strAlphabet;
for (const char& character : u_setAlphabet)
strAlphabet += character;
return strAlphabet;
}
Is this a good way to solve this task? Appending single chars to the string does not seem optimal for a large u_setAlphabet (multiple reallocations?). Is there another way to do it?
The simplest, most readable and most efficient answer is:
return std::string(s.begin(), s.end());
The implementation may choose to detect the length of the range up-front and only allocate once; both libc++ and libstdc++ do this when given a forward iterator range.
The string class also offers you reserve, just like vector, to manage the capacity:
std::string result;
result.reserve(s.size());
for (char c : s) result.push_back(c); // or std::copy
return result;
It also offers assign, append and insert member functions, but since those offer the strong exception guarantee, they may have to allocate a new buffer before destroying the old one (thanks to @T.C. for pointing out this crucial detail!). The libc++ implementation does not reallocate if the existing capacity suffices, while GCC5's libstdc++ implementation reallocates unconditionally.
std::string has a constructor for that:
auto s = std::string(begin(u_setAlphabet), end(u_setAlphabet));
It is better to use the constructor that accepts iterators. For example:
std::string getAlphabet() {
return { u_setAlphabet.begin(), u_setAlphabet.end() };
}
Both return std::string(u_setAlphabet.begin(), u_setAlphabet.end()); and return { u_setAlphabet.begin(), u_setAlphabet.end() }; are the same in C++11. I prefer @VladfromMoscow's solution because we do not need to make any assumption about the returned type of the temporary object.
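For completeness, here is a small self-contained sketch of the iterator-range approach. Passing the set as a parameter and the sorted variant are my additions: an unordered_set iterates in an unspecified order, so sorting is one way to get a predictable string.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>

// Build the string with the iterator-range constructor.
std::string getAlphabet(const std::unordered_set<char>& alphabet)
{
    return std::string(alphabet.begin(), alphabet.end());
}

// Same, but sorted so the result does not depend on the hash order.
std::string getSortedAlphabet(const std::unordered_set<char>& alphabet)
{
    std::string result(alphabet.begin(), alphabet.end());
    std::sort(result.begin(), result.end());
    return result;
}
```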

Interpret a std::string as a std::vector of char_type?

I have a template<typename T> function that takes a const vector<T>&. In said function, I use the vector's cbegin(), cend(), size(), and operator[].
As far as I understand it, both string and vector use contiguous space, so I was wondering if I could reuse the function for both data types in an elegant manner.
Can a std::string be reinterpreted as a std::vector of (the appropriate) char_type? If so, what would the limitations be?
If you make your template just for type const T& and use the begin(), end(), etc, functions which both vector and string share then your code will work with both types.
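A minimal sketch of that idea, assuming the function only needs cbegin() and cend(); count_equal is a made-up example function, not the asker's.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Counts the elements equal to `value`, using only members that
// std::vector and std::string both provide (cbegin, cend).
template <typename Container>
std::size_t count_equal(const Container& c, typename Container::value_type value)
{
    std::size_t n = 0;
    for (typename Container::const_iterator it = c.cbegin(); it != c.cend(); ++it)
        if (*it == value)
            ++n;
    return n;
}
```

The same template then accepts a std::string, a std::vector<char>, or a std::vector<int> without any casting.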
Go STL way and use iterators. Accept iterator to begin and iterator to end. It will work with all possible containers, including non-containers like streams.
There is no guarantee the layout of string and vector will be the same. They theoretically could be, but they probably aren't in any common implementation. Therefore, you can't do this safely. See Zan's answer for a better solution.
Let me explain: If I am a standard library implementer and decide to implement std::string like so....
template ...
class basic_string {
public:
...
private:
CharT* mData;
size_t mSize;
};
and decide to implement std::vector like so...
template ...
class vector {
public:
...
private:
T* mEnd;
T* mBegin;
};
When you reinterpret_cast<string*>(&myVector) you wind up interpreting the pointer to the end of your data as the pointer to the start of your data, and the pointer to the start of your data as the size of your data. If the padding between members is different, or there are extra members, it could get even weirder and more broken than that.
So yes, in order for this to possibly work they both need to store contiguous data, but they also need quite a bit else to be the same between the implementations for it to work.
std::experimental::array_view<const char> n4512 represents a contiguous buffer of chars.
Writing your own is not hard, and it solves this problem and (in my experience) many more.
Both string and vector are compatible with an array view.
This lets you move your implementation into a .cpp file (and not expose it), gives you the same performance as doing it with std::vector<T> const& and probably the same implementation, avoids duplicating code, and uses light weight contiguous buffer type erasure (which is full of tasty keywords).
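To give an idea of what "writing your own" might involve, here is a deliberately tiny sketch restricted to char. The class name char_view and the function count_spaces are hypothetical; a real array_view would be a template and offer more operations.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical minimal read-only view over a contiguous run of chars.
// It does not own the data: the source must outlive the view.
class char_view
{
    const char* first_;
    std::size_t size_;

public:
    char_view(const char* p, std::size_t n) : first_(p), size_(n) {}
    char_view(const std::vector<char>& v) : first_(v.data()), size_(v.size()) {}
    char_view(const std::string& s) : first_(s.data()), size_(s.size()) {}

    const char* begin() const { return first_; }
    const char* end() const { return first_ + size_; }
    std::size_t size() const { return size_; }
    const char& operator[](std::size_t i) const { return first_[i]; }
};

// Not a template, so its body could live in a .cpp file.
std::size_t count_spaces(char_view text)
{
    std::size_t n = 0;
    for (const char* p = text.begin(); p != text.end(); ++p)
        if (*p == ' ')
            ++n;
    return n;
}
```

Both a std::string and a std::vector<char> convert implicitly to char_view, so count_spaces(myString) and count_spaces(myVector) call the same function.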
If the key point is that you want to access a contiguous area in memory where instances of a specific char type are stored, then you could define your function as
void myfunc(const CType *p, int size) {
...
}
to make it clear that you assume they must be adjacent in memory.
Then for example to pass the content of a vector the code is simply
myfunc(&myvect[0], myvect.size());
and for a string
myfunc(mystr.data(), mystr.size());
or
myfunc(buffer, n);
for an array.
You can't directly typecast a std::vector to a std::string or vice versa. But using the iterators that STL containers provide does allow you to iterate both a vector and a string in the same way. And if your function requires random access of the container in question then either would work.
std::vector<char> str1 {'a', 'b', 'c'};
std::string str2 = "abc";
template<typename Iterator>
void iterator_function(Iterator begin, Iterator end)
{
    for(Iterator it = begin; it != end; ++it)
    {
        std::cout << *it << std::endl;
    }
}
iterator_function(str1.begin(), str1.end());
iterator_function(str2.begin(), str2.end());
Both of those last two function calls would print the same thing.
Now if you wanted to write a non-template version that only handles characters stored in a string or in a vector, you could write something that iterates over the internal array.
void array_function(const char * array, unsigned length)
{
    for(unsigned i = 0; i < length; ++i)
    {
        std::cout << array[i] << std::endl;
    }
}
Both functions would do the same thing in the following scenarios.
std::vector<char> str1 {'a', 'b', 'c'};
std::string str2 = "abc";
iterator_function(str1.begin(), str1.end());
iterator_function(str2.begin(), str2.end());
array_function(str1.data(), str1.size());
array_function(str2.data(), str2.size());
There are always multiple ways to solve a problem. Depending on what you have available any number of solutions might work. Try both and see which works better for your application. If you don't know the iterator type then the char typed array iteration is useful. If you know you will always have the template type to pass in then the template iterator method might be more useful.
The way your question is put at the moment is a bit confusing. If you mean to be asking "is it safe to cast a std::vector type to a std::string type or vice versa if the vector happens to contain char values of the appropriate type?", the answer is: no way, don't even think about it! If you're asking: "can I access the contiguous memory of non-empty sequences of char type if they're of the type std::vector or std::string?" then the answer is, yes you can (with the data() member function).

Is there a CompareTo method in C++ similar to Java where you can use > < = operations on a data type

I know that in Java there is a compareTo method that you can write in a class that will compare two variables and return 1, -1, or 0, signifying greater than, less than, and equal to. Is there a way to do this in C++?
Background:
I'm creating a modified string class that holds a string and an array list. I want to compare the strings in the traditional fashion: a word that comes earlier in the alphabet is less than one that comes later. The array list just stores the pages on which the word was indexed in a text file. The specifics do not matter, since I already have the class written. I just need to create a compareTo method that can be used in the main of my .cpp file or by other data types, such as various trees.
I'll write the code in Java, as I know how, and maybe someone can help me with the C++ syntax (I'm required to write in C++ for this project, unfortunately, and I am new to C++).
I will shorten the code to give a rough outline of what I'm doing, then write the compareTo method as I know how in Java.
class name ModifiedString
Has variables: word , arraylist pagelist
Methods:
getWord (returns the word associated with the class, i.e its string)
appendPageList (adds page numbers to the array list, this doesnt matter in this question)
Here's how I would do it in Java:
int compareTo(ModifiedString a){
    if(this.getWord() > a.getWord())
        return 1;
    else if (this.getWord() < a.getWord())
        return -1;
    else return 0;
}
Then when <, >, or == is used on a ModifiedString, the operations would be valid.
std::string already includes a working overload of operator<, so you can just compare strings directly. Java uses compareTo primarily because the built-in comparison operator produces results that aren't generally useful for strings. Being a lower-level language, Java doesn't support user-defined operator overloads, so it uses compareTo as a band-aid to cover for the inadequacy of the language.
From your description, however, you don't need to deal with any of that directly at all. At least as you've described the problem, what you really want is something like:
std::map<std::string, std::vector<int> > page_map;
You'll then read words in from your text file, and insert the page number where each occurs into the page map:
page_map[current_word].push_back(current_page);
Note that I've used std::map above, on the expectation that you may want ordered results (e.g., be able to quickly find all words from age to ale in alphabetical order). If you don't care about ordering, you may want to use std::unordered_map instead.
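As a rough illustration of both points (inserting page numbers and walking an alphabetical range), here is a small sketch. The helper names add_occurrence and words_between are invented for this example.

```cpp
#include <map>
#include <string>
#include <vector>

typedef std::map<std::string, std::vector<int> > PageMap;

// Record that `word` occurs on `page`. operator[] creates an empty
// vector the first time a word is seen.
void add_occurrence(PageMap& page_map, const std::string& word, int page)
{
    page_map[word].push_back(page);
}

// Collect the words in the half-open range [first, last), in alphabetical
// order. This relies on std::map keeping its keys sorted.
std::vector<std::string> words_between(const PageMap& page_map,
                                       const std::string& first,
                                       const std::string& last)
{
    std::vector<std::string> result;
    for (PageMap::const_iterator it = page_map.lower_bound(first);
         it != page_map.end() && it->first < last; ++it)
        result.push_back(it->first);
    return result;
}
```

With an unordered_map the insertion function would look the same, but a range query like words_between would no longer be possible without sorting the keys first.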
Edit: here's a simple text cross-reference program that reads a text file (from standard input) and writes out a cross-reference by line number (i.e., each "word", and the numbers of the lines on which that word appeared).
#include <algorithm>
#include <map>
#include <unordered_map>
#include <iostream>
#include <string>
#include <vector>
#include <sstream>
#include <iterator>
#include "infix_iterator.h" // not a standard header; see the note below

typedef std::map<std::string, std::vector<unsigned> > index;

// Placed in namespace std so that std::ostream_iterator can find it.
// Strictly speaking, the standard does not permit adding overloads to std.
namespace std {
ostream &operator<<(ostream &os, index::value_type const &i) {
    os << i.first << ":\t";
    std::copy(i.second.begin(), i.second.end(),
        infix_ostream_iterator<unsigned>(os, ", "));
    return os;
}
}

// Split a line into whitespace-separated words and record the line number.
void add_words(std::string const &line, size_t num, index &i) {
    std::istringstream is(line);
    std::string temp;
    while (is >> temp)
        i[temp].push_back(num);
}

int main() {
    index i;
    std::string line;
    size_t line_number = 0;
    while (std::getline(std::cin, line))
        add_words(line, ++line_number, i);
    std::copy(i.begin(), i.end(),
        std::ostream_iterator<index::value_type>(std::cout, "\n"));
    return 0;
}
If you look at the first typedef (of index), you can change it from map to unordered_map if you want to test a hash table vs. a red-black tree. Note that this interprets "word" pretty loosely -- basically any sequence of non-whitespace characters, so for example, it'll treat example, as a "word" (and it'll be separate from example).
Note that this uses the infix_iterator I've posted elsewhere.
There is no standard way in C++ to define an operator that does what the Java compareTo() function does. You can, however, implement
int compareTo(const ModifiedString&, const ModifiedString&);
Another option is to overload the <, <=, >, >=, == and != operators, e.g. by implementing
bool operator<(const ModifiedString&, const ModifiedString&);
In C++, you define bool operator< directly, with no need to invent funny names; the same goes for operator> and operator==. They're generally implemented as member functions taking one extra argument, the right-hand side, but you could also define them as non-member functions taking two arguments.
Sun decided not to include operator overloading in Java, so they provided an in-class way (through member functions) to do that job: the equals() and compareTo() functions.
C++ has operator overloading, which allows you to specify the behaviour of the language operators within your own types.
To learn how to overload operators, I suggest you read this thread: Operator overloading
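Putting the answers above together, a C++ version of the class from the question might look roughly like this. The member names follow the question's outline; the constructor and the exact set of operators are my assumptions. std::string::compare returns a negative value, zero, or a positive value, which is close to what compareTo does in Java.

```cpp
#include <string>
#include <vector>

// Hypothetical C++ counterpart of the ModifiedString class from the question.
class ModifiedString
{
    std::string word;
    std::vector<int> pageList;

public:
    explicit ModifiedString(const std::string& w) : word(w) {}

    const std::string& getWord() const { return word; }
    void appendPageList(int page) { pageList.push_back(page); }

    // Java-style three-way comparison: negative, zero or positive.
    int compareTo(const ModifiedString& other) const
    {
        return word.compare(other.word);
    }
};

// Relational operators, defined in terms of the word only.
inline bool operator<(const ModifiedString& a, const ModifiedString& b)  { return a.getWord() < b.getWord(); }
inline bool operator>(const ModifiedString& a, const ModifiedString& b)  { return b < a; }
inline bool operator==(const ModifiedString& a, const ModifiedString& b) { return a.getWord() == b.getWord(); }
inline bool operator!=(const ModifiedString& a, const ModifiedString& b) { return !(a == b); }
```

Once operator< exists, the class can be used directly as a key in std::set or std::map, or sorted with std::sort, without a compareTo call.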

Idiomatic C++ for finding a range of equal length strings, given a vector of strings (ordered by length)

Given a std::vector< std::string > that is ordered by string length, how can I find a range of equal-length strings?
I am looking for an idiomatic solution in C++.
I have found this solution:
// any idea for a better name? (English is not my mother tongue)
bool less_length( const std::string& lhs, const std::string& rhs )
{
return lhs.length() < rhs.length();
}
std::vector< std::string > words;
words.push_back("ape");
words.push_back("cat");
words.push_back("dog");
words.push_back("camel");
size_t length = 3;
// this will give a range from "ape" to "dog" (included):
std::equal_range( words.begin(), words.end(), std::string( length, 'a' ), less_length );
Is there a standard way of doing this (beautifully)?
I expect that you could write a comparator as follows:
struct LengthComparator {
    bool operator()(const std::string &lhs, std::string::size_type rhs) const {
        return lhs.size() < rhs;
    }
    bool operator()(std::string::size_type lhs, const std::string &rhs) const {
        return lhs < rhs.size();
    }
    bool operator()(const std::string &lhs, const std::string &rhs) const {
        return lhs.size() < rhs.size();
    }
};
Then use it:
std::equal_range(words.begin(), words.end(), length, LengthComparator());
I expect the third overload of operator() is never used, because the information it provides is redundant. The range has to be pre-sorted, so there's no point the algorithm comparing two items from the range, it should be comparing items from the range against the target you supply. But the standard doesn't guarantee that. [Edit: and defining all three means you can use the same comparator class to put the vector in order in the first place, which might be convenient].
This works for me (gcc 4.3.4), and while I think this will work on your implementation too, I'm less sure that it is actually valid. It implements the comparisons that the description of equal_range says will be true of the result, and 25.3.3/1 doesn't require that the template parameter T must be exactly the type of the objects referred to by the iterators. But there might be some text I've missed which adds more restrictions, so I'd do more standards-trawling before using it in anything important.
Your way is definitely not unidiomatic, but having to construct a dummy string with the target length does not look very elegant and it isn't very readable either.
I'd perhaps write my own helper function (e.g. string_length_range), encapsulating a plain, simple loop through the string list. There is no need to use std:: tools for everything.
std::equal_range does a binary search. Which means the words vector must be sorted, which in this case means that it must be non-decreasing in length.
I think your solution is a good one, definitely better than writing your own implementation of binary search which is notoriously error prone and hard to prove correct.
If doing a binary search was not your intent, then I agree with Alexander. Just a simple for loop through the words is the cleanest.
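A middle ground between the two suggestions is a helper that hides the details but still uses the library's binary search. The sketch below borrows the name string_length_range from the earlier answer, but instead of a hand-written loop it calls std::lower_bound and std::upper_bound with a length as the search value, so no dummy string is needed. Note the argument order of the two lambdas: lower_bound calls comp(element, value), while upper_bound calls comp(value, element).

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

typedef std::vector<std::string>::const_iterator WordIt;

// Returns [first, last) of the words whose length equals `length`.
// Precondition: `words` is sorted by non-decreasing length.
std::pair<WordIt, WordIt> string_length_range(const std::vector<std::string>& words,
                                              std::string::size_type length)
{
    WordIt first = std::lower_bound(words.begin(), words.end(), length,
        [](const std::string& w, std::string::size_type n) { return w.size() < n; });
    WordIt last = std::upper_bound(first, words.end(), length,
        [](std::string::size_type n, const std::string& w) { return n < w.size(); });
    return std::make_pair(first, last);
}
```

For the example vector (ape, cat, dog, camel) and a length of 3, the returned range covers ape through dog, and its end iterator points at camel.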

C++ using STL List, how to copy an existing list into a new list

Right now I'm working with a copy constructor for taking a list called val of type char, and I need to take all the elements of a string v that is passed into the copy constructor and put them into the val list.
class LongInt {
public:
    LongInt(const string v);
private:
    list<char> val;
};
So here in the public section of the LongInt class I have a copy constructor which takes the val list and copies the v string into it. Can anyone help me figure out how to do this? Thanks in advance!
You'll have to iterate over the string and extract the data character by character. Using the std::copy algorithm should work:
std::copy(v.begin(), v.end(), std::back_inserter(val));
In your LongInt constructor just use the iterator, iterator list constructor:
LongInt(const string v) : val(v.begin(), v.end()) { }
That being said, have you considered actually using string or possibly deque<char> to manipulate your sequence rather than list? Depending on your needs, those alternatives might be better.
LongInt::LongInt( const string v ) : val(v.begin(), v.end())
{
}
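Here is the same idea as a complete class that compiles on its own. Taking the string by const reference instead of by value, the explicit keyword, and the digits() accessor are my additions; the accessor exists only so the result can be inspected.

```cpp
#include <list>
#include <string>

class LongInt
{
public:
    // The range constructor of std::list copies each character of v.
    explicit LongInt(const std::string& v) : val(v.begin(), v.end()) {}

    // Hypothetical accessor, not part of the original class.
    const std::list<char>& digits() const { return val; }

private:
    std::list<char> val;
};
```

Constructing LongInt("12345") then leaves val holding the five characters in the same order as the string.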
First, use std::string if it's a string you're storing. It's a container like any other. If you can't or don't want to store a string, use std::vector. But that would boil down to a less-functional std::string anyway, so just use std::string.
For the copying:
std::copy( v.begin(), v.end(), std::back_inserter(val) );
But just use a std::string if it's a list of chars you're storing.