C++ empty and array index


Is it possible to do something like:
string word = "Hello";
word[3] = null;
if(word[3] == null){/.../}
in C++, basically making an array element empty. For example, if I wanted to remove the duplicate characters from the array, I'd set them to null first and then shift the array to the left every time I found an array index that contained null.
If this is not possible what's a good way of doing something like this in C++ ?

If you want to remove adjacent duplicate characters, you can do this:
std::string::iterator new_end = std::unique(word.begin(), word.end());
word.erase(new_end, word.end());
If you want to mark arbitrary characters for removal, you can skip the marking and just provide the appropriate predicate to std::remove_if:
new_end = std::remove_if(word.begin(), word.end(), IsDuplicate);
word.erase(new_end, word.end());
However, I can't think of an appropriate predicate to use here that doesn't exhibit undefined behavior. I would just write my own algorithm:
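#include <iterator> // std::iterator_traits
#include <map>
#include <utility> // std::move
// Keeps the first occurrence of each value, preserving order.
// Returns the new logical end; erase [result, last) afterwards.
template<typename IteratorT>
IteratorT RemoveDuplicates(IteratorT first, IteratorT last)
{
    typedef typename std::iterator_traits<IteratorT>::value_type ValueT;
    std::map<ValueT, int> counts;
    for (auto scan = first; scan != last; ++scan)
    {
        if (++counts[*scan] == 1) // first time this value is seen
        {
            if (first != scan) // avoid moving an element onto itself
                *first = std::move(*scan);
            ++first;
        }
    }
    return first;
}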
Or, if you don't care about the order of the elements, you could simply sort it, then use the first solution.
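A minimal sketch of that sort-then-unique variant, assuming the original order of the characters does not need to be preserved. The helper name sorted_unique is made up for illustration:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: remove all duplicates when element order does not matter.
std::string sorted_unique(std::string s) {
    // Sorting makes equal characters adjacent...
    std::sort(s.begin(), s.end());
    // ...so std::unique can collapse each run to a single character.
    s.erase(std::unique(s.begin(), s.end()), s.end());
    return s;
}
```

Calling sorted_unique("hello") yields the string "ehlo": the duplicates are gone, but the characters come back in sorted order.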

This is possible in a limited sense: each element of a std::string is a char, so you can assign the null character to it, e.g. word[3] = '\0'. Note that C++ has no null keyword, and NULL is a null pointer constant rather than a character, so if (word[3] == NULL) usually compiles only with a warning. Alternatives would be: if (!word[3]) or if (word[3] == '\0'). Keep in mind that this does not shorten the string: std::string tracks its length separately, so the '\0' stays in place as an ordinary element.
But in any case you should consider using STL algorithms for removing duplicates.
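The "mark, then shift left" idea from the question could be sketched as below. The helper names mark and remove_marked are assumptions made up for this example, and the approach only works if '\0' never appears as real data in the string:

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper: mark the character at index i for removal, if it exists.
void mark(std::string& word, std::string::size_type i) {
    if (i < word.size()) word[i] = '\0';
}

// Hypothetical helper: drop every marked character.
// Assumes '\0' is never legitimate content of the string.
std::string remove_marked(std::string word) {
    // std::remove shifts the kept characters left and returns the new logical end;
    // erase then cuts off the leftover tail.
    word.erase(std::remove(word.begin(), word.end(), '\0'), word.end());
    return word;
}
```

With word = "Hello", calling mark(word, 3) leaves the length at 5, and remove_marked(word) then returns "Helo".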

I think you should take a look at the algorithm in the STL.
You are not very specific about what you want to remove but maybe this helps:
std::string string_with_dup("AABBCCDD");
std::string string_without_dup;
std::cout << string_with_dup << std::endl;
// with copy
std::unique_copy(string_with_dup.begin(), string_with_dup.end(), std::back_inserter(string_without_dup));
std::cout << string_without_dup << std::endl;
// or inplace
string_with_dup.erase(std::unique(string_with_dup.begin(), string_with_dup.end()), string_with_dup.end());
std::cout << string_with_dup << std::endl;

If you want to remove all duplicates (not only the adjacent ones), you should use the erase-remove idiom with something like this:
#include <iostream>
#include <map>
#include <string>
#include <algorithm>
using namespace std;
struct is_repeated {
is_repeated( map<char,int>& x ) :r(&x) {};
map<char,int>* r;
bool operator()( char c ) {
(*r)[c]++;
if( (*r)[c] > 1 )
return true;
return false;
}
};
int main (int argc, char**argv)
{
map<char,int> counter_map;
string v = "hello hello hello hello hello hello hello";
cout << v << endl;
is_repeated counter(counter_map);
v.erase( remove_if(v.begin(), v.end(), counter ), v.end() );
cout << v << endl;
}
Output:
hello hello hello hello hello hello hello
helo

Related

In C++11, how to find and return all the item(s) in a vector of strings that start with a given string?

(Note: When I refer to vectors, I'm referring to the vector class provided by <vector>.)
The problem
Given a string x and a vector of strings, how can I retrieve the string(s) in the vector that start with x? Preferably in a way that is time-efficient?
That is, if x is "apple" and the vector is vector<string> foods = {"apple pie","blueberry tarts","cherry cobbler"}, then it should return "apple pie" in some capacity.
I am using C++11 and I'm not an expert on it, so simple answers with explanations would be much appreciated. Forgive me if the answer is obvious - I am relatively new to the language.
Possible solutions I've considered
The obvious solution would be to just create an iterator and iterate through each string in the vector, pulling out all items that start with the given string using the overloaded version of rfind that has the pos parameter. (That is, like this: str.rfind("start",0))
However, with a large vector this is time-inefficient, so I'm wondering if there is a better way to do this, i.e. sorting the vector and using some kind of binary search, or perhaps modifying the find method from <algorithm>?

The simplest way to copy the desired strings is a linear scan. For example, use the standard library's std::copy_if to perform the copying and a lambda to encapsulate the "starts with" string comparison.
#include <algorithm>
#include <iostream>
#include <string>
#include <vector>
int main()
{
std::vector<std::string> foods = { "apple pie","blueberry tarts","cherry cobbler" };
std::string prefix{ "apple" };
auto starts_with = [&prefix](const std::string &str) {
return str.compare(0, prefix.size(), prefix) == 0;
};
std::vector<std::string> result;
std::copy_if(begin(foods), end(foods), back_inserter(result), starts_with);
for (const auto &str : result) {
std::cout << str << '\n';
}
}

A good way to solve your problem would be to use binary search. Note that this requires sorting the vector of strings first, which gives the algorithm a time complexity of O(N log N).
vector <string> v = {"a", "apple b", "apple c", "d"}; // stuff
string find = "apple";
// create a second vector that contains the substrings of the first vector
vector <pair<string, string>> v2;
for(string item : v){
v2.push_back({item.substr(0, find.size()), item});
}
sort(v2.begin(), v2.end());
// binary search to find the leftmost and rightmost occurrence of find
int l = v.size()-1, r = 0;
for(int i = v.size()/2; i >= 1; i /= 2){
while(l-i >= 0 && v2[l-i].first >= find){l -= i;}
while(r+i < v.size() && v2[r+i].first <= find){r += i;}
}
if(v2[l].first == find){
for(int i = l; i <= r; ++i){
cout << v2[i].second << endl;
}
}
else{
cout << "No matches were found." << endl;
}
In my code, we first create a second vector called v2 to store pairs of strings. After sorting it, we implement binary search by jumps to find the leftmost and rightmost occurrences of find. Lastly, we check if there are any occurrences at all (this is an edge case), and print all the found strings if occurrences exist.
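If the vector is already sorted, a shorter variant of the same idea can lean on std::lower_bound instead of a hand-written jump search. This is only a sketch under that assumption; the function name with_prefix is hypothetical:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: return all strings that start with `prefix`.
// Precondition (assumption): `sorted` is already sorted with operator<.
std::vector<std::string> with_prefix(const std::vector<std::string>& sorted,
                                     const std::string& prefix) {
    // Every string that starts with `prefix` compares >= prefix, and all such
    // strings are contiguous in sorted order, so the run begins at lower_bound.
    auto first = std::lower_bound(sorted.begin(), sorted.end(), prefix);
    // Walk forward until the first string that no longer starts with `prefix`.
    auto last = std::find_if(first, sorted.end(), [&prefix](const std::string& s) {
        return s.compare(0, prefix.size(), prefix) != 0;
    });
    return std::vector<std::string>(first, last);
}
```

The binary search costs O(log N) comparisons and the final walk is proportional to the number of matches, so this pays off only when the sorted vector is searched repeatedly.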

You can do this in a single pass over the vector. This is the best you'll do unless the vector is pre-sorted, since the cost of sorting will outweigh any gain you get from using a binary search.
Using std::copy_if makes this pretty simple:
#include <string>
#include <vector>
#include <algorithm>
int main() {
std::vector<std::string> v = {
"apple pie",
"blueberry tarts",
"apple",
"cherry cobbler",
"pie"
};
std::vector<std::string> v2;
std::string to_find = "apple";
std::copy_if(
v.begin(),
v.end(),
std::back_inserter(v2),
[&to_find](const std::string& el) {
return el.compare(0, to_find.size(), to_find) == 0;
}
);
}
This will copy all elements from v that match the predicate function into v2. The predicate simply checks that the first to_find.size() characters of each element match the string to find, using the std::string::compare overload that takes a position, a length, and the string to compare against.

Is it possible to remove elements from a vector of shared_ptr?

Say I have
vector<shared_ptr<string>> enemy;
how do I remove elements from the enemy vector?
Thanks for your help in advance
**Edit (code in context)
void RemoveEnemy( vector<shared_ptr<Enemy>> & chart, string id )
{
int i = 0;
bool found = FALSE;
for(auto it = chart.begin(); it != chart.end(); i++)
{
if(id == chart[i]->GetEnemyID() )
{
found = TRUE;
chart.erase(it);
}
}
the code above segfaults me

You remove elements the same way you remove any elements from any std::vector - via the std::vector::erase() method, for instance. All you need for that is an iterator to the desired element to remove.
In your case, since you are storing std::shared_ptr<std::string> objects rather than storing actual std::string objects, you may need to use something like std::find_if() to find the vector element containing the desired string value, eg:
void removeEnemy(string name)
{
auto iter = std::find_if(enemy.begin(), enemy.end(),
[&](auto &s){ return (*s == name); }
);
if (iter != enemy.end())
enemy.erase(iter);
}
UPDATE: in the new code you have added, you are incorrectly mixing indexes and iterators together. You are creating an infinite loop if the vector is not empty, as you never increment the it iterator that controls your loop, you are incrementing your index i variable instead (see what happens when you don't give your variables unique and meaningful names?). So you end up going out of bounds of the vector into surrounding memory. That is why you get the segfault error.
Even though you are (trying to) use an iterator to loop through the vector, you are using indexes to access the elements, instead of dereferencing the iterator to access the elements. You don't need to use indexes at all in this situation, the iterator alone will suffice.
Try this instead:
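void RemoveEnemy( vector<shared_ptr<Enemy>> & chart, string id )
{
    for(auto it = chart.begin(); it != chart.end(); ++it)
    {
        // it refers to a shared_ptr, so dereference it first to reach the Enemy
        if (id == (*it)->GetEnemyID() )
        {
            chart.erase(it);
            return; // it is invalid after erase, so stop here
        }
    }
}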
Or, using the kind of code I suggested earlier:
void RemoveEnemy( vector<shared_ptr<Enemy>> & chart, string id )
{
auto iter = std::find_if(chart.begin(), chart.end(),
[&](auto &enemy){ return (enemy->GetEnemyID() == id); }
);
if (iter != chart.end())
chart.erase(iter);
}

The problem with your code is that erase() invalidates the iterator. You must use it = chart.erase(it).
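A sketch of that pattern for the case where several elements may match. The Enemy struct here is a hypothetical stand-in for the asker's class, and RemoveEnemies is a made-up name:

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-in for the asker's Enemy class.
struct Enemy {
    std::string id;
    const std::string& GetEnemyID() const { return id; }
};

// Erase every enemy whose ID matches; returns how many were removed.
std::size_t RemoveEnemies(std::vector<std::shared_ptr<Enemy>>& chart,
                          const std::string& id) {
    std::size_t removed = 0;
    for (auto it = chart.begin(); it != chart.end(); ) {
        if ((*it)->GetEnemyID() == id) {
            it = chart.erase(it);  // erase returns the iterator after the removed element
            ++removed;
        } else {
            ++it;                  // only advance when nothing was erased
        }
    }
    return removed;
}
```

Note that the increment moved out of the for header: after an erase, the returned iterator already points at the next element.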

I like mine which will remove aliens at high speed and without any care for the ordering of the other items. Removal with prejudice!
Note: remove_if is most often used with erase, and it preserves the order of the remaining elements. partition does not preserve the ordering of the elements and may be faster for that reason. The code needs C++14 for the generic lambda and the "alien"s literal.
Build and run partition-test.cpp with:
make partition-test && echo 1 alien 9 alien 2 8 alien 4 7 alien 5 3 | ./partition-test
#include <algorithm>
#include <iostream>
#include <iterator>
#include <memory>
#include <string>
#include <vector>
using namespace std;
template <typename T>
ostream &operator<<(ostream &os, const vector<T> &container) {
bool comma = false;
for (const auto &x : container) {
if (comma)
os << ", ";
os << *x;
comma = true;
}
return os;
}
int main() {
vector<shared_ptr<string>> iv;
auto x = make_shared<string>();
while (cin >> *x) {
iv.push_back(x);
x = make_shared<string>();
}
cout << iv << '\n';
iv.erase(partition(begin(iv), end(iv),
[](const auto &x) { return *x != "alien"s; }),
end(iv));
cout << iv << '\n';
return 0;
}

Output over unique elements of `std::multiset` and their frequency using std:: algorithm in C++ (no loops)

I have the following multiset in C++:
template<class T>
class CompareWords {
public:
bool operator()(T s1, T s2)
{
if (s1.length() == s2.length())
{
return ( s1 < s2 );
}
else return ( s1.length() < s2.length() );
}
};
typedef multiset<string, CompareWords<string>> mySet;
typedef std::multiset<string,CompareWords<string>>::iterator mySetItr;
mySet mWords;
I want to print each unique std::string element in the set once, and next to it how many times it appears (its frequency). As you can see, the functor CompareWords keeps the set sorted.
A solution was proposed elsewhere, but it's not what I need, because I am looking for a solution that does not use a loop statement (while, for, do while).
I know that I can use this:
//gives a pointer to the first and last range or repeated element "word"
auto p = mWords.equal_range(word);
// compute the distance between the iterators that bound the range AKA frequency
int count = static_cast<int>(std::distance(p.first, p.second));
but I can't quite come up with a solution without loops?

Unlike the other solutions, this iterates over the container exactly once. This is important, as iterating over a node-based structure like std::multiset has reasonably high overhead (the nodes are distinct allocations).
There are no explicit loops, but the tail-end recursion will be optimized down to a loop, and I call an algorithm that will run a loop.
template<class Iterator, class Clumps, class Compare>
void produce_clumps( Iterator begin, Iterator end, Clumps&& clumps, Compare&& compare) {
if (begin==end) return; // do nothing for nothing
typedef decltype(*begin) value_type_ref;
// We know runs are at least 1 long, so don't bother comparing the first time.
// Generally, advancing will have a cost similar to comparing. If comparing is much
// more expensive than advancing, then this is sub optimal:
std::size_t count = 1;
Iterator run_end = std::find_if(
std::next(begin), end,
[&]( value_type_ref v ){
if (!compare(*begin, v)) {
++count;
return false;
}
return true;
}
);
// call our clumps callback:
clumps( begin, run_end, count );
// tail end recurse:
return produce_clumps( std::move(run_end), std::move(end), std::forward<Clumps>(clumps), std::forward<Compare>(compare) );
}
The above is a relatively generic algorithm. Here is its use:
int main() {
typedef std::multiset<std::string> mySet;
typedef std::multiset<std::string>::iterator mySetItr;
mySet mWords { "A", "A", "B" };
produce_clumps( mWords.begin(), mWords.end(),
[]( mySetItr run_start, mySetItr /* run_end -- unused */, std::size_t count )
{
std::cout << "Word [" << *run_start << "] occurs " << count << " times\n";
},
CompareWords<std::string>{}
);
}
The iterators must refer to a sorted sequence (with regards to the Comparator), then the clumps will be passed to the 3rd argument together with their length.
Every element in the multiset will be visited exactly once with the above algorithm (as a right-hand side argument to your comparison function). Every start of a clump will be visited (length of clump) additional times as a left-hand side argument (including clumps of length 1). There will be exactly N iterator increments performed, and no more than N+C+1 iterator comparisons (N=number of elements, C=number of clumps).

#include <iostream>
#include <algorithm>
#include <set>
#include <iterator>
#include <string>
int main()
{
typedef std::multiset<std::string> mySet;
typedef std::multiset<std::string>::iterator mySetItr;
mySet mWords;
mWords.insert("A");
mWords.insert("A");
mWords.insert("B");
mySetItr it = std::begin(mWords), itend = std::end(mWords);
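// Caution: passing the iterator by reference and advancing it inside the
// callback relies on how for_each happens to be implemented.
std::for_each<mySetItr&>(it, itend, [&mWords, &it] (const std::string& word)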
{
auto p = mWords.equal_range(word);
int count = static_cast<int>(std::distance(p.first, p.second));
std::cout << word << " " << count << std::endl;
std::advance(it, count - 1);
});
}
Outputs:
A 2
B 1

The following does the job without an explicit loop, using recursion:
void print_rec(const mySet& set, mySetItr it)
{
if (it == set.end()) {
return;
}
const auto& word = *it;
auto next = std::find_if(it, set.end(),
[&word](const std::string& s) {
return s != word;
});
std::cout << word << " appears " << std::distance(it, next) << std::endl;
print_rec(set, next);
}
void print(const mySet& set)
{
print_rec(set, set.begin());
}
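A related sketch, assuming the results should be collected rather than printed: each step asks the container itself for the end of the current run via upper_bound, so it also respects a custom comparator such as CompareWords. The function name collect_counts is hypothetical:

```cpp
#include <cstddef>
#include <iterator>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: append (word, frequency) pairs to `out` without
// writing a loop statement. Recurses once per distinct word.
template <class Set>
void collect_counts(const Set& words, typename Set::const_iterator it,
                    std::vector<std::pair<typename Set::key_type, std::size_t>>& out) {
    if (it == words.end()) return;
    // upper_bound gives the first element after the run equivalent to *it.
    auto next = words.upper_bound(*it);
    out.emplace_back(*it, static_cast<std::size_t>(std::distance(it, next)));
    collect_counts(words, next, out);
}
```

For a multiset holding "A", "A", "B", calling collect_counts(words, words.begin(), out) fills out with ("A", 2) and ("B", 1). std::distance still walks each run internally, so the loop is hidden rather than gone.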

How to use vector and struct?

I need to count letters from the string, sort them by count and cout results. For this purpose I'm trying to use vector and struct. Here is part of my code, but it's not working, because I don't know how to implement something:
#include <iostream>
#include <string>
#include <vector>
#include <algorithm>
using namespace std;
struct int_pair{
int key;
int value;
};
bool sort_by_value(int_pair left, int_pair right){
return left.value < right.value;
}
int main() {
string characters = "aasa asdfs dfh f ukjyhkh k wse f sdf sdfsdf";
vector<int_pair> most_frequent;
for (string::size_type i = 0; i <= characters.length(); i++) {
int int_char = (int)characters[i];
most_frequent[int_char]++; <-- I want to do something like this, but it's not working
}
sort(most_frequent.begin(), most_frequent.end(), sort_by_value);
for (vector<int_pair>::iterator it = most_frequent.begin(); it != most_frequent.end(); ++it) <-- is this call correct?
cout << " " << it->key << ":" << it->value << endl;
return 0;
}
In this code there are two parts I don't know how to deal with:
most_frequent[int_char]++; <-- I want to do something like this, but it's not working
and
for (vector<int_pair>::iterator it = most_frequent.begin(); it != most_frequent.end(); ++it) <-- is this call correct?
Maybe you can also spot other mistakes and potential issues in this code.

I would use a std::map to determine the frequency of each letter, then copy that into a multimap while reversing the key and value to get them in order.
#include <iostream>
#include <iterator> // std::inserter
#include <map>
#include <string>
#include <utility> // std::pair
#include <algorithm>
template<class T, class U>
std::pair<U,T> flip_pair(const std::pair<T,U>& p) {
return std::make_pair(p.second,p.first);
}
int main(){
std::string characters = "zxcvopqiuweriuzxchajksdui";
std::map<char,int> freq;
std::multimap<int,char> rev_freq;
// Calculate the frequency of each letter.
for(char c: characters){
freq[c]++;
}
// Copy the results into a multimap with the key and value flipped
std::transform(std::begin(freq), std::end(freq),
std::inserter(rev_freq, rev_freq.begin()),
flip_pair<char,int>);
// Print out the results in order.
for(std::pair<int,char> p : rev_freq){
std::cout << p.first << ": " << p.second << std::endl;
}
}

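This should do what you need, provided the vector is first created with enough elements (for example vector<int_pair> most_frequent(256);), since operator[] does not grow a vector: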
most_frequent[int_char].key = int_char;
most_frequent[int_char].value++;
Yes, it sets the key many times, even though it doesn't need to.
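Put together, the asker's vector-of-struct approach might look like the sketch below. It assumes 8-bit characters (hence 256 slots) and sorts in descending order, which is a choice made here rather than something the question specifies; count_chars is a made-up name:

```cpp
#include <algorithm>
#include <string>
#include <vector>

struct int_pair {
    int key;    // character code
    int value;  // number of occurrences
};

// Hypothetical helper: count characters with a pre-sized vector<int_pair>,
// then return only the characters that occur, most frequent first.
std::vector<int_pair> count_chars(const std::string& text) {
    std::vector<int_pair> counts(256, int_pair{0, 0});  // assumes 8-bit char
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        // Go through unsigned char so the index is never negative.
        int code = static_cast<unsigned char>(text[i]);
        counts[code].key = code;
        counts[code].value++;
    }
    // Drop the slots for characters that never occurred.
    counts.erase(std::remove_if(counts.begin(), counts.end(),
                                [](const int_pair& p) { return p.value == 0; }),
                 counts.end());
    // stable_sort keeps characters with equal counts in ascending key order.
    std::stable_sort(counts.begin(), counts.end(),
                     [](const int_pair& l, const int_pair& r) { return l.value > r.value; });
    return counts;
}
```

Because the key is stored in each element, it survives the sort, which is exactly what the struct buys over a plain vector<int>.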

When accessing the container with the key (vector is indexed with an integer, which is "the key" in your case), you don't have to store the key in the value field of the container again.
So you don't need your struct, since you only need the value field and can store the number of occurrences directly in the vector.
The idea is to fill the vector with 256 integers in the beginning, all initialized to zero. Then, use the vector index as your "key" (character code) to access the elements (number of occurrences).
This will result in a code similar to this:
// initialize with 256 entries, one for each character:
vector<int> counts(256);
for (string::size_type i = 0; i < characters.length(); i++) {
// for each occurrence of a character, increase the value in the vector
// (cast through unsigned char so the index is never negative):
int int_char = (unsigned char)characters[i];
counts[int_char]++;
}
Once filling of the vector is done, you can find the maximum value (not only the value but also the key where it is stored) using the std::max_element algorithm:
vector<int>::iterator most_frequent =
std::max_element(counts.begin(), counts.end());
// getting the character (index within the container, "key"):
std::cout << (char)(most_frequent - counts.begin());
// the number of occurrences ("value"):
std::cout << (*most_frequent);
Here is your example with the changes (only printing the most frequent character, here it is the space so you don't see it): http://ideone.com/94GfZz
You can sort this vector; however, you will lose the key, since the elements will move and change their indices. There is a nice trick to process statistics like that: use a reversed (multi)map, with key and value swapped:
multimap<int,int> keyForOccurrence;
for (vector<int>::iterator i = counts.begin(); i != counts.end(); ++i) {
int occurrences = *i;
int character = i - counts.begin();
keyForOccurrence.insert(std::pair<int,int>(occurrences, character));
}
Updated code: http://ideone.com/Ub5rnL
The last thing you should now sort out by yourself is how to access and process the data within this map. The fancy thing about this reversed map is that it is now automatically sorted by occurrence, since maps are sorted by key.

I find it more natural to use a std::map container to store each character's occurrences. The character is the map's key, and its occurrence count is the map's value.
It's easy to scan the source string and build this map using std::map::operator[], and ++ to increase the occurrence count.
Then, you can build a second map from the above map, with key and value inverted: so this map will be sorted by occurrences, and then you can print this second map.
Note that you have to use a std::multimap as this second map, since its keys (i.e. the occurrences) can be repeated.
Sample code follows (I tested it with VS2010 SP1/VC10):
#include <stddef.h> // for size_t
#include <algorithm> // for std::transform
#include <functional> // for std::greater
#include <iostream> // for std::cout
#include <iterator> // for std::inserter
#include <map> // for std::map, std::multimap
#include <ostream> // for std::endl
#include <string> // for std::string
#include <utility> // for std::pair
using namespace std;
int main()
{
string str = "aasa asdfs dfh f ukjyhkh k wse f sdf sdfsdf";
// Build the occurrences map (char -> occurrences)
map<char, size_t> freq;
for (size_t i = 0; i < str.length(); ++i)
freq[ str[i] ]++;
// Build a new map from previous map with inverted <key, value> pairs,
// so this new map will be sorted by old map's value (i.e. char's
// occurrences), which is new map's key.
// Use the std::greater comparator to sort in descending order.
multimap<size_t, char, greater<size_t>> sorted_freq;
transform(
freq.begin(), freq.end(), // source
inserter(sorted_freq, sorted_freq.begin()), // destination
[](const pair<char, size_t>& p) // invert key<->value
{
return pair<size_t, char>(p.second, p.first);
}
);
// Print results
for (auto it = sorted_freq.begin(); it != sorted_freq.end(); ++it)
cout << it->second << ": " << it->first << endl;
}
Output:
: 9
s: 7
f: 7
d: 5
a: 4
k: 3
h: 3
u: 1
w: 1
y: 1
j: 1
e: 1
If you don't want to print the space character occurrences, you can easily filter that out.
Note that using std::map/std::multimap will also scale better than std::vector for non-ASCII characters, e.g. if you use Unicode UTF-32 (since there are far more than 256 Unicode characters).

How to use STL to do a case insensitive binary search for a string

If I have a vector of strings how do I do a binary search for a certain string using a case-insensitive comparison? I can't think of any easy way to do this.

Provide a case-insensitive comparison function to std::sort so that the container is sorted without regard to case (the Boost string algorithms can help).
Then do a binary search on the sorted vector, passing the same case-insensitive comparison.
A lambda expression really helps here.
If you use find_if instead, the vector doesn't have to be sorted first, but it is slow if you search frequently and the set is quite large.
EDIT: here is the example
#include <boost/algorithm/string.hpp>
#include <algorithm>
::::
auto comp=[](const std::string& a, const std::string& b){
return boost::ilexicographical_compare
<std::string, std::string>(a,b);
};
std::sort(vs.begin(), vs.end(), comp);
std::binary_search(vs.begin(), vs.end(), value_to_search_for, comp);
If you are not going to sort the list, note that std::find takes no predicate; use std::find_if with a case-insensitive equality test such as boost::iequals instead.
TESTED
http://en.cppreference.com/w/cpp/algorithm/find
http://en.cppreference.com/w/cpp/algorithm/binary_search
http://en.cppreference.com/w/cpp/algorithm/sort

You could use find from the algorithm header to locate a particular value in a container, but it does not use binary search: it scans linearly, which is why there is no requirement to sort the container before passing it to find.
binary_search is also available in algorithm, and it does require a sorted range.

I think you need to write your own compare function that compares the lower-case variants of two strings. Using this function you can sort the vector, and then search for the query string with the same comparator.
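One way such a comparator could look using only the standard library. This is a sketch that assumes single-byte (ASCII-like) text, since std::tolower works one char at a time; the names less_no_case and contains_no_case are made up for illustration:

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical case-insensitive "less than" for single-byte strings.
bool less_no_case(const std::string& a, const std::string& b) {
    return std::lexicographical_compare(
        a.begin(), a.end(), b.begin(), b.end(),
        [](unsigned char x, unsigned char y) {
            // unsigned char parameters keep std::tolower well-defined.
            return std::tolower(x) < std::tolower(y);
        });
}

// Sort a copy with the comparator, then search with the same comparator.
bool contains_no_case(std::vector<std::string> v, const std::string& key) {
    std::sort(v.begin(), v.end(), less_no_case);
    return std::binary_search(v.begin(), v.end(), key, less_no_case);
}
```

The important point is that sort and binary_search receive the same ordering; searching a vector sorted with plain operator< using this comparator would give unreliable results. In real code the vector would be sorted once and then searched many times, rather than sorted on every call as in this compact example.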

Use find_if, providing a custom predicate:
find_if (myvector.begin(), myvector.end(), MyPredicate);
http://www.cplusplus.com/reference/algorithm/find_if/
Also see this for help on writing a reusable predicate:
Making map::find operation case insensitive

#include <vector>
#include <string>
#include <algorithm>
#include <iostream>
#include <strings.h> // strncasecmp()
inline int icompare(std::string const& a, std::string const& b) {
size_t a_len = a.size(), b_len = b.size();
size_t cmp_len = std::min(a_len, b_len);
// strncasecmp() is a non-standard function, use the one available for your platform.
if(int r = strncasecmp(a.data(), b.data(), cmp_len))
return r;
return (a_len > b_len) - (a_len < b_len);
}
struct LessNoCase {
bool operator()(std::string const& a, std::string const& b) const {
return icompare(a, b) < 0;
}
};
template<class Iterator, class T>
Iterator binary_search_caseless(Iterator beg, Iterator end, T const& value) {
Iterator i = std::lower_bound(beg, end, value, LessNoCase());
return i != end && !icompare(*i, value)
? i // found
: end // not found
;
}
int main() {
char const* strings[] = {
"abc",
"def",
"ghi"
};
std::vector<std::string> v(
strings + 0,
strings + sizeof strings / sizeof *strings
);
// prepare for binary search
std::sort(v.begin(), v.end(), LessNoCase());
// do the binary search
std::cout << "index of 'abc' is " << binary_search_caseless(v.begin(), v.end(), "abc") - v.begin() << '\n';
std::cout << "index of 'ABC' is " << binary_search_caseless(v.begin(), v.end(), "ABC") - v.begin() << '\n';
std::cout << "index of 'DEF' is " << binary_search_caseless(v.begin(), v.end(), "DEF") - v.begin() << '\n';
std::cout << "index of 'xyz' is " << binary_search_caseless(v.begin(), v.end(), "xyz") - v.begin() << '\n';
}
Outputs:
./test
index of 'abc' is 0
index of 'ABC' is 0
index of 'DEF' is 1
index of 'xyz' is 3

std::find does not support a predicate parameter, so the correct algorithm you're looking for is std::find_if.
std::find_if( vec.begin(), vec.end(), InsensitiveCompare("search string") );
...where InsensitiveCompare is a functor that returns true for case-insensitive comparisons. For example:
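struct InsensitiveCompare
{
    std::string comp;
    InsensitiveCompare( std::string const &s ) : comp(s) {}
    bool operator() ( std::string const &test ) const
    {
        // true if test equals comp, ignoring case (needs <algorithm> and <cctype>)
        return test.size() == comp.size() &&
            std::equal(test.begin(), test.end(), comp.begin(),
                [](unsigned char a, unsigned char b) {
                    return std::tolower(a) == std::tolower(b);
                });
    }
};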

If you only need to know if such an element exists, use std::binary_search. If you need to access that element and know its position, use std::lower_bound.
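A small sketch of the std::lower_bound variant, shown with the default operator< for brevity. The helper name find_sorted and the -1 "not found" convention are assumptions for this example:

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical helper: index of `key` in a sorted vector, or -1 if absent.
long find_sorted(const std::vector<std::string>& sorted, const std::string& key) {
    auto it = std::lower_bound(sorted.begin(), sorted.end(), key);
    // lower_bound returns the first element not less than key,
    // so equality still has to be confirmed.
    if (it == sorted.end() || *it != key) return -1;
    return static_cast<long>(it - sorted.begin());
}
```

For a case-insensitive search, the same comparator used for sorting would be passed to std::lower_bound as a fourth argument, and the equality check would become "neither element is less than the other" under that comparator.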
