Vector of unsigned char: iterators not working

I want to strip the CRLF sequences from the end of a vector, but my code is not working: on the first iteration of the while loop, equal is called and returns false. In debug mode "i" == 0 and has a "ptr" value == "0x002e4cfe".
string testS = "\r\n\r\n\r\n<-3 CRLF Testing trim new lines 3 CRLF->\r\n\r\n\r\n";
vector<uint8> _data; _data.clear();
_data.insert(_data.end(), testS.begin(), testS.end());
vector<uint8>::iterator i = _data.end();
uint32 bytesToCut = 0;
while(i != _data.begin()) {
if(equal(i - 1, i, "\r\n")) {
bytesToCut += 2;
--i; if(i == _data.begin()) return; else --i;
} else {
if(bytesToCut) _data.erase(_data.end() - bytesToCut, _data.end());
return;
}
}
Thanks a lot for your answers. But I need a version that uses iterators, because this code runs while I parse chunked HTTP transfer data that is written to a vector. I need a function that takes a pointer to the vector and an iterator marking the position from which to remove CRLF backwards. I think all my problems come from the iterators.

Your code is invalid, at the very least because it passes an incorrect range to the std::equal algorithm:
if(equal(i - 1, i, "\r\n")) {
This expression compares only one element of the vector, the one pointed to by iterator i - 1, with '\r'. You have to write something like this (provided at least two elements precede i):
if(equal(i - 2, i, "\r\n")) {
If you need to remove "\r\n" pairs from the end of the vector, I can suggest the following approach (I used my own variable names and included testing output):
std::string s = "\r\n\r\n\r\n<-3 CRLF Testing trim new lines 3 CRLF->\r\n\r\n\r\n";
std::vector<unsigned char> v( s.begin(), s.end() );
std::cout << v.size() << std::endl;
auto last = v.end();
auto prev = v.end();
while ( prev != v.begin() && *--prev == '\n' && prev != v.begin() && *--prev == '\r' )
{
last = prev;
}
v.erase( last, v.end() );
std::cout << v.size() << std::endl;

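Instead of reinventing the wheel, you can use the existing std::string member functions with something like:
std::string s;
s = s.substr(0, s.find_last_not_of(" \r\n") + 1); // + 1 keeps the last wanted character; npos + 1 wraps to 0, giving an empty string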

If you just need to trim '\r' and '\n' from the end, a simple substr will do:
std::string str = "\r\n\r\n\r\nSome string\r\n\r\n\r\n";
size_t newLength = str.length();
// the newLength > 0 check stops the loop on empty or all-newline strings
while (newLength > 0 && (str[newLength - 1] == '\r' || str[newLength - 1] == '\n')) newLength--;
str = str.substr(0, newLength);
std::cout << str;
Don't sweat small stuff :)
Removing all '\r' and '\n' could be as simple as (C++03):
#include <iostream>
#include <string>
#include <algorithm>
int main() {
std::string str = "\r\n\r\n\r\nSome string\r\n\r\n\r\n";
str.erase(std::remove(str.begin(), str.end(), '\r'), str.end());
str.erase(std::remove(str.begin(), str.end(), '\n'), str.end());
std::cout << str;
}
or:
bool isUnwantedChar(char c) {
return (c == '\r' || c == '\n');
}
int main() {
std::string str = "\r\n\r\n\r\nSome string\r\n\r\n\r\n";
str.erase(std::remove_if(str.begin(), str.end(), isUnwantedChar), str.end());
std::cout << str;
}

First of all, your vector initialization is ... non-optimal. All you needed to do is:
string testS = "\r\n\r\n\r\n<-3 CRLF Testing trim new lines 3 CRLF->\r\n\r\n\r\n";
vector<uint8> _data(testS.begin(), testS.end());
Second, if you wanted to remove the \r and \n characters, you could have done it in the string:
testS.erase(std::remove_if(testS.begin(), testS.end(), [](char c)
{
return c == '\r' || c == '\n';
}), testS.end());
If you wanted to do it in the vector, it is the same basic process:
_data.erase(std::remove_if(_data.begin(), _data.end(), [](uint8 ui)
{
return ui == static_cast<uint8>('\r') || ui == static_cast<uint8>('\n');
}), _data.end());
Your problem is likely due to the usage of invalidated iterators in your loop (that has several other logical issues, but since it shouldn't exist anyway, I won't touch on) that removes elements 1-by-1.
If you wanted to remove the items just from the end of the string/vector, it would be slightly different, but still the same basic pattern:
std::string::size_type start = testS.find_first_not_of("\r\n"); // first character that is neither \r nor \n
if (start != std::string::npos) {
std::string::size_type end = testS.find_first_of("\r\n", start); // first \r or \n after the real characters
if (end != std::string::npos) testS.erase(testS.begin() + end, testS.end()); // erase the \r and \n characters at the end of the string
}
or alternatively (if \r\n can be in the middle of the string as well):
std::string::reverse_iterator rit = std::find_if_not(testS.rbegin(), testS.rend(), [](char c)
{
return c == '\r' || c == '\n';
});
testS.erase(rit.base(), testS.end());

Related

What is the most efficient way to replace \\n with \n

I ask this because I am using SFML strings. sf::String does not insert a new line in the presence of a \n.
I can't seem to figure out a way without using 3/4 STL algorithms.
std::replace_if(str.begin(), str.end(), [](const char& c) { return c == '\\n'; }, '\n');
does not work. The string remains the same.
I have also tried replacing the \\ occurrence with a temporary, say ~. This works, but when I go to replace the ~ with \, then it adds a \\ instead of a \
I have produced a solution by manually replacing and deleting duplicates after \n insertion :
for (auto it = str.begin(); it != str.end(); ++it) {
if (*it == '\\') {
if (it + 1 != str.end()) {
if (*(it + 1) != 'n') continue;
*it = '\n';
str.erase(it + 1);
}
}
}

You might do:
str = std::regex_replace(str, std::regex(R"(\\n)"), "\n");

The problem is that '\\n' is not a single character but two characters. So it needs to be stored in a string "\\n". But now std::replace_if doesn't work because it operates on elements and the elements of a std::string are single characters.
You can write a new function to replace sub-strings within a string and call that instead. For example:
std::string& replace_all(std::string& s, std::string const& from, std::string const& to)
{
if(!from.empty())
for(std::string::size_type pos = 0; (pos = s.find(from, pos) + 1); pos += to.size())
s.replace(--pos, from.size(), to);
return s;
}
// ...
std::string s = "a\\nb\\nc";
std::cout << s << '\n';
replace_all(s, "\\n", "\n");
std::cout << s << '\n';
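The loop condition above is compact because it relies on npos + 1 wrapping around to 0. If that reads as too clever, the same idea can be spelled out more explicitly. This is only a sketch, and the name replace_all_explicit is made up to avoid clashing with the function above:

```cpp
#include <string>

// Replaces every occurrence of `from` in `s` with `to`, scanning left to right.
// Text inserted by a replacement is never rescanned, so the loop terminates
// even when `to` contains `from`.
std::string& replace_all_explicit(std::string& s,
                                  const std::string& from,
                                  const std::string& to)
{
    if (from.empty())
        return s;  // an empty pattern would match everywhere; do nothing
    std::string::size_type pos = 0;
    while ((pos = s.find(from, pos)) != std::string::npos) {
        s.replace(pos, from.size(), to);
        pos += to.size();  // continue after the text just inserted
    }
    return s;
}
```

Called as replace_all_explicit(s, "\\n", "\n"), it turns each backslash-plus-n pair into a single newline character.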

remove_if last character from a string

I would like to remove the first and last brackets from a string passed by reference. Unfortunately, I am having difficulty removing the first and last elements conditionally.
I cannot understand why remove_if doesn't work as I expect with iterators.
#include <iostream>
#include <algorithm>
using namespace std;
void print_wo_brackets(string& str){
auto detect_bracket = [](char x){ return(')' == x || '(' == x);};
if (!str.empty())
{
str.erase(std::remove_if(str.begin(), str.begin() + 1, detect_bracket));
}
if (!str.empty())
{
str.erase(std::remove_if(str.end()-1, str.end(), detect_bracket));
}
}
int main()
{
string str = "abc)";
cout << str << endl;
print_wo_brackets(str);
cout << str << endl;
string str2 = "(abc";
cout << str2 << endl;
print_wo_brackets(str2);
cout << str2 << endl;
return 0;
}
Output
abc)
ac <- HERE I expect abc
(abc
abc

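If remove_if finds nothing to remove, it returns the end iterator of the range you gave it, and the single-iterator erase then removes whatever element sits there (or, when that iterator is str.end(), tries to erase a nonexistent element). You should use the range version of erase in both places: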
void print_wo_brackets(string& str){
auto detect_bracket = [](char x){ return(')' == x || '(' == x);};
if (!str.empty())
{
str.erase(std::remove_if(str.begin(), str.begin() + 1, detect_bracket), str.begin() + 1);
}
if (!str.empty())
{
str.erase(std::remove_if(str.end()-1, str.end(), detect_bracket), str.end());
}
}

The problem is here:
if (!str.empty())
{
str.erase(std::remove_if(str.begin(), str.begin() + 1, detect_bracket));
}
Here you erase unconditionally.
std::remove_if returns an iterator to the beginning of the range "for removal". If there are no elements to remove, it returns the end of the range (str.begin() + 1 in this case). So you erase the element at begin + 1, which is b.
To protect against this problem, you should probably do something more like:
if (!str.empty())
{
auto it = std::remove_if(str.begin(), str.begin() + 1, detect_bracket);
if(it != str.begin() + 1)
str.erase(it);
}
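I assume you simply want to explore the behavior of the standard library and iterators, as otherwise a plain check:
if(!str.empty() && (str[0] == '(' || str[0] == ')'))
str.erase(0, 1); // erase(0) alone would erase everything up to the end of the string
is much simpler.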
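To make the difference between the two erase overloads concrete, here is a small sketch. The helper names is_bracket, strip_first_unsafe and strip_first_safe are made up for illustration, and the unsafe variant is only well-defined for strings of at least two characters.

```cpp
#include <algorithm>
#include <string>

bool is_bracket(char c) { return c == '(' || c == ')'; }

// Single-iterator erase: removes whatever remove_if returns. When the first
// character is not a bracket, remove_if returns begin() + 1, so the SECOND
// character gets erased by mistake.
std::string strip_first_unsafe(std::string s)
{
    if (!s.empty())
        s.erase(std::remove_if(s.begin(), s.begin() + 1, is_bracket));
    return s;
}

// Range erase: [result, begin() + 1) is empty when nothing matched,
// so nothing is erased in that case.
std::string strip_first_safe(std::string s)
{
    if (!s.empty())
        s.erase(std::remove_if(s.begin(), s.begin() + 1, is_bracket),
                s.begin() + 1);
    return s;
}
```

With the input "abc)", the unsafe variant yields "ac)" (the b is lost, as in the question's output), while the safe variant leaves the string untouched.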

Alternative:
#include <iostream>
#include <string>
std::string without_brackets(std::string str, char beg = '(', char end = ')') {
auto last = str.find_last_of(end);
auto first = str.find_first_of(beg);
if(last != std::string::npos) {
str.erase(str.begin()+last);
}
if(first != std::string::npos) {
str.erase(str.begin()+first);
}
return str;
}
using namespace std;
int main() {
cout << without_brackets("abc)") << endl
<< without_brackets("(abc") << endl
<< without_brackets("(abc)") << endl
<< without_brackets("abc") << endl;
return 0;
}
see: http://ideone.com/T2bZDe
result:
abc
abc
abc
abc

As stated in the comments by @PeteBecker, remove_if is not the right algorithm here. Since you only want to remove the first and last characters if they match, a much simpler approach is to test front() and back() against the two parentheses ( and ) (brackets would be [ and ]):
void remove_surrounding(string& str, char left = '(', char right = ')')
{
if (!str.empty() && str.front() == left)
str.erase(str.begin());
if (!str.empty() && str.back() == right)
str.erase(str.end() - 1);
}

All you need is this:
void print_wo_brackets(string& str){
str.erase(std::remove_if(str.begin(), str.end(),
[&](char &c) { return (c == ')' || c == '(') && (&c == str.data() || &c == (str.data() + str.size() - 1));}), str.end());
}
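By writing:
str.erase(std::remove_if(str.end()-1, str.end(), detect_bracket));
you invoke undefined behaviour whenever the last character is not a bracket, because remove_if then returns str.end() and the single-iterator erase must not be given the end iterator.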

Getting the words from a sentence and storing them in a vector of strings

Alright, guys ...
Here's my set that has all the letters. I'm defining a word as consisting of consecutive letters from the set.
const char LETTERS_ARR[] = {"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"};
const std::set<char> LETTERS_SET(LETTERS_ARR, LETTERS_ARR + sizeof(LETTERS_ARR)/sizeof(char));
I was hoping that this function would take in a string representing a sentence and return a vector of strings that are the individual words in the sentence.
std::vector<std::string> get_sntnc_wrds(std::string S) {
std::vector<std::string> retvec;
std::string::iterator it = S.begin();
while (it != S.end()) {
if (LETTERS_SET.count(*it) == 1) {
std::string str(1,*it);
int k(0);
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1) == 1))) {
str.push_back(*(it + (++k)));
}
retvec.push_back(str);
it += k;
}
else {
++it;
}
}
return retvec;
}
For instance, the following call should return a vector of the strings "Yo", "dawg", etc.
std::string mystring("Yo, dawg, I heard you life functions, so we put a function inside your function so you can derive while you derive.");
std::vector<std::string> mystringvec = get_sntnc_wrds(mystring);
But everything isn't going as planned. I tried running my code and it was putting the entire sentence into the first and only element of the vector. My function is very messy code and perhaps you can help me come up with a simpler version. I don't expect you to be able to trace my thought process in my pitiful attempt at writing that function.

Try this instead:
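#include <vector>
#include <cctype>
#include <string>
#include <algorithm>
using std::vector; using std::string; using std::find_if;
// true if the argument is whitespace, false otherwise
bool space(char c)
{
return std::isspace(static_cast<unsigned char>(c)) != 0; // cast avoids undefined behavior for negative char values
}
// false if the argument is whitespace, true otherwise
bool not_space(char c)
{
return !space(c);
}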
vector<string> split(const string& str)
{
typedef string::const_iterator iter;
vector<string> ret;
iter i = str.begin();
while (i != str.end())
{
// ignore leading blanks
i = find_if(i, str.end(), not_space);
// find end of next word
iter j = find_if(i, str.end(), space);
// copy the characters in [i, j)
if (i != str.end())
ret.push_back(string(i, j));
i = j;
}
return ret;
}
The split function will return a vector of strings, each element containing one word.
This code is taken from the Accelerated C++ book, so it's not mine, but it works. There are other superb examples of using containers and algorithms for solving every-day problems in this book. I could even get a one-liner to show the contents of a file at the output console. Highly recommended.

It's just a bracketing issue. My advice is to (almost) never put in more brackets than necessary; it only confuses things.
while (it+k+1 != S.end() && LETTERS_SET.count(*(it+k+1)) == 1) {
Your code compares the character with 1, not the return value of count.
Also, although count does return an integer in this context, I would simplify further and treat the return value as a boolean:
while (it+k+1 != S.end() && LETTERS_SET.count(*(it+k+1))) {

You should use a string stream with std::copy like so (note that this splits on whitespace only, so punctuation stays attached to the words):
#include <iostream>
#include <string>
#include <sstream>
#include <algorithm>
#include <iterator>
#include <vector>
int main() {
std::string sentence = "And I feel fine...";
std::istringstream iss(sentence);
std::vector<std::string> split;
std::copy(std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>(),
std::back_inserter(split));
// This is to print the vector
for(auto iter = split.begin();
iter != split.end();
++iter)
{
std::cout << *iter << "\n";
}
}

I would use another, simpler approach based on the member functions of class std::string. For example:
const char LETTERS[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
std::string s( "This12 34is 56a78 test." );
std::vector<std::string> v;
for ( std::string::size_type first = s.find_first_of( LETTERS, 0 );
first != std::string::npos;
first = s.find_first_of( LETTERS, first ) )
{
std::string::size_type last = s.find_first_not_of( LETTERS, first );
v.push_back(
std::string( s, first, last == std::string::npos ? std::string::npos : last - first ) );
first = last;
}
for ( const std::string &s : v ) std::cout << s << ' ';
std::cout << std::endl;

You made two mistakes here, which I have corrected in the following code.
First, the loop condition should be
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1)) == 1))
Second, the iterator should advance past the word with
it += (k+1);
The complete code is
std::vector<std::string> get_sntnc_wrds(std::string S) {
std::vector<std::string> retvec;
std::string::iterator it = S.begin();
while (it != S.end()) {
if (LETTERS_SET.count(*it) == 1) {
std::string str(1,*it);
int k(0);
while (((it+k+1) != S.end()) && (LETTERS_SET.count(*(it+k+1)) == 1)) {
str.push_back(*(it + (++k)));
}
retvec.push_back(str);
it += (k+1);
}
else {
++it;
}
}
return retvec;
}
The output has been tested.

Loop quitting for no reason

I have a question regarding C++. This is my current function:
string clarifyWord(string str) {
//Remove all spaces before string
unsigned long i = 0;
int currentASCII = 0;
while (i < str.length()) {
currentASCII = int(str[i]);
if (currentASCII == 32) {
str.erase(i);
i++;
continue;
} else {
break;
}
}
//Remove all spaces after string
i = str.length();
while (i > -1) {
currentASCII = int(str[i]);
if (currentASCII == 32) {
str.erase(i);
i--;
continue;
} else {
break;
}
}
return str;
}
Just to get the basic and obvious things out of the way, I have #include <string> and using namespace std; so I do have access to the string functions.
The thing is though that the loop is quitting and sometimes skipping the second loop. I am passing in the str to be " Cheese " and it should remove all the spaces before the string and after the string.
In the main function, I am also assigning a variable to clarifyWord(str) where str is above. It doesn't seem to print that out either using cout << str;.
Is there something I am missing with printing out strings or looping with strings? Also ASCII code 32 is Space.

Okay so the erase function you are calling looks like this:
string& erase ( size_t pos = 0, size_t n = npos );
The n parameter is the number of characters to delete. Its default value, npos, means "delete everything up to the end of the string", so set the second parameter to 1:
str.erase(i, 1);
[EDIT]
You could change the first loop to this:
while (str.length() > 0 && str[0] == ' ')
{
str.erase(0,1);
}
and the second loop to this:
while (str.length() > 0 && str[str.length() - 1] == ' ')
{
str.erase(str.length() - 1, 1);
}

In your second loop, you can't initialize i to str.length().
str[str.length()] is going to be after the end of your string, and so is unlikely to be a space (thus triggering the break out of the second loop).

You're using erase (modifying the string) while you're in a loop checking its size. This is a dangerous way of processing the string. As you return a new string, I would recommend you first to search for the first occurrence in the string of the non-space character, and then the last one, and then returning a substring. Something along the lines of (not tested):
size_t init = str.find_first_not_of(' ');
if (init == std::string::npos)
return "";
size_t fini = str.find_last_not_of(' ');
return str.substr(init, fini - init + 1);
You see, no loops, erases, etc.

unsigned long i ... while (i > -1): well, that's not right, is it? How would you expect that to work? The compiler will in fact convert both operands to the same type: while (i > static_cast<unsigned long>(-1)). And that's just another way to write ULONG_MAX, i.e. while (i > ULONG_MAX). In other words, while (false).
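One common way to walk a string backwards with an unsigned index is to test the index before decrementing it, so it never has to go below zero. A minimal sketch (the function name count_trailing_spaces is made up for illustration):

```cpp
#include <cstddef>
#include <string>

// Counts the spaces at the end of `str` using an unsigned index.
// `i-- > 0` compares first and decrements afterwards, so the body sees
// size() - 1 down to 0 and the loop ends without relying on a negative value.
std::size_t count_trailing_spaces(const std::string& str)
{
    std::size_t count = 0;
    for (std::size_t i = str.size(); i-- > 0; ) {
        if (str[i] != ' ')
            break;
        ++count;
    }
    return count;
}
```

The result can then be fed to str.erase(str.size() - count) to drop the trailing spaces in one call.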

You're using erase incorrectly. It'll erase from pos to npos.
i.e. string& erase ( size_t pos = 0, size_t n = npos );
See: http://www.cplusplus.com/reference/string/string/erase/
A better way to do this is to note the position of the first non space and where the spaces occur at the end of the string. Then use either substr or erase twice.
You also don't need to go to the trouble of doing this:
currentASCII = int(str[i]);
if (currentASCII == 32) {
Instead do this:
if (str[i] == ' ') {
Which I think you'll agree is a lot easier to read.
So, you can shorten it somewhat with something like:
string clarifyWord(string str) {
string::size_type start = 0, end = str.length();
while (start < end && str[start] == ' ') ++start; // skip leading spaces
while (end > start && str[end - 1] == ' ') --end; // skip trailing spaces
return str.substr(start, end - start); // second argument is a length, not a position
}

Efficient way to check if std::string has only spaces

I was just talking with a friend about what would be the most efficient way to check if a std::string has only spaces. He needs to do this on an embedded project he is working on and apparently this kind of optimization matters to him.
I came up with the following code, which uses strtok().
bool has_only_spaces(std::string& str)
{
char* token = strtok(const_cast<char*>(str.c_str()), " ");
while (token != NULL)
{
if (*token != ' ')
{
return true;
}
}
return false;
}
I'm looking for feedback on this code and more efficient ways to perform this task are also welcome.

if(str.find_first_not_of(' ') != std::string::npos)
{
// There's a non-space.
}

In C++11, the all_of algorithm can be employed:
// Check if s consists only of whitespaces
bool whiteSpacesOnly = std::all_of(s.begin(),s.end(),isspace);

Why so much work, so much typing?
bool has_only_spaces(const std::string& str) {
return str.find_first_not_of (' ') == str.npos;
}

Wouldn't it be easier to do:
bool has_only_spaces(const std::string &str)
{
for (std::string::const_iterator it = str.begin(); it != str.end(); ++it)
{
if (*it != ' ') return false;
}
return true;
}
This has the advantage of returning early as soon as a non-space character is found, so it will be marginally more efficient than solutions that examine the whole string.

To check if string has only whitespace in c++11:
bool is_whitespace(const std::string& s) {
return std::all_of(s.begin(), s.end(), isspace);
}
in pre-c++11:
bool is_whitespace(const std::string& s) {
for (std::string::const_iterator it = s.begin(); it != s.end(); ++it) {
if (!isspace(*it)) {
return false;
}
}
return true;
}

Here's one that only uses STL (Requires C++11)
inline bool isBlank(const std::string& s)
{
return std::all_of(s.cbegin(),s.cend(),[](char c) { return std::isspace(c); });
}
It relies on the fact that if the string is empty (begin == end), std::all_of also returns true.
Here is a small test program: http://cpp.sh/2tx6
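One detail worth hedging against: passing a plain char with a negative value to std::isspace is undefined behavior. A possible variant that sidesteps this by letting the lambda take unsigned char is sketched below (the name is_blank is made up; it still requires C++11):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// True when every character is whitespace; also true for the empty string,
// because std::all_of returns true on an empty range.
inline bool is_blank(const std::string& s)
{
    // Each char is converted to unsigned char before reaching std::isspace.
    return std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isspace(c) != 0; });
}
```

With this, is_blank("") and is_blank(" \t\n") are both true, while is_blank(" a ") is false.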

Using strtok like that is bad style! strtok modifies the buffer it tokenizes (it replaces the delimiter chars with \0).
Here's a non modifying version.
const char* p = str.c_str();
while(*p == ' ') ++p;
return *p == 0; // true only if the terminator was reached, i.e. every character was a space
It can be optimized even further, if you iterate through it in machine word chunks. To be portable, you would also have to take alignment into consideration.

I do not approve of the const_cast above, nor of using strtok.
A std::string can contain embedded nulls, but let's assume it will be all ASCII 32 characters before you hit the null terminator.
One way you can approach this is with a simple loop, and I will assume a const char *.
bool all_spaces( const char * v )
{
for ( ; *v; ++v )
{
if( *v != ' ' )
return false;
}
return true;
}
For larger strings, you can check word-at-a-time until you reach the last word, and then assume the 32-bit word (say) will be 0x20202020 which may be faster.

Something like:
return std::find_if(
str.begin(), str.end(),
std::bind2nd( std::not_equal_to<char>(), ' ' ) )
== str.end();
If you're interested in white space, and not just the space character,
then the best thing to do is to define a predicate, and use it:
struct IsNotSpace
{
bool operator()( char ch ) const
{
return ! ::isspace( static_cast<unsigned char>( ch ) );
}
};
If you're doing any text processing at all, a collection of such simple
predicates will be invaluable (and they're easy to generate
automatically from the list of functions in <ctype.h>).

It's highly unlikely you'll beat a compiler-optimized naive algorithm for this, e.g.
string::iterator it(str.begin()), end(str.end());
for(; it != end && *it == ' '; ++it);
return it == end;
EDIT: Actually - there is a quicker way (depending on size of string and memory available)..
std::string ns(str.size(), ' ');
return ns == str;
EDIT: actually above is not quick.. it's daft... stick with the naive implementation, the optimizer will be all over that...
EDIT AGAIN: dammit, I guess it's better to look at the functions in std::string
return str.find_first_not_of(' ') == string::npos;

I had a similar problem in a programming assignment, and here is one other solution I came up with after reviewing the others. Here I simply build a new sentence without the extra spaces. If there are consecutive spaces, I skip all but one of them.
string sentence;
string newsent; //reconstruct new sentence
string dbl = " ";
getline(cin, sentence);
int len = sentence.length();
for(int i = 0; i < len; i++){
//if there are multiple whitespaces, this loop will iterate until there are none, then go back one.
if (isspace(sentence[i]) && isspace(sentence[i+1])) {do{
i++;
}while (isspace(sentence[i])); i--;} //here, you have to dial back one to maintain at least one space.
newsent +=sentence[i];
}
cout << newsent << "\n";

Hm... I'd do this:
for (auto i = str.begin(); i != str.end(); ++i)
if (!isspace(static_cast<unsigned char>(*i)))
return false;
return true;
isspace is declared in <cctype> for C++.
Edit: Thanks to James for pointing out that isspace has undefined behavior on negative char values, hence the cast to unsigned char.

If you are using CString, you can do
CString myString = " "; // All whitespace
if(myString.Trim().IsEmpty())
{
// string is all whitespace
}
This has the benefit of trimming all newline, space and tab characters.
