Remove spaces from string not taking effect - c++

I'm trying to remove all characters and spaces except letters. But the "erase spaces" part doesn't take effect; it only takes effect if I comment out the "remove characters" part.
for (int i = 0; i < s.size(); i++)
{
if (!(s[i] >= 'a' && s[i] <= 'z' || s[i] >= 'A' && s[i] <= 'Z'))
{
s[i] = '\0';
}
}
s.erase(remove(s.begin(), s.end(), ' '), s.end());

You're replacing all the non-alphabetic characters with NULs, then removing all the spaces. Since NULs are not spaces, this latter step does nothing. If you change the assignment in the loop to
s[i] = ' ';
you would instead replace them with spaces, which would then be removed by the erase(remove(...)) call.
If you want to make the code more readable, you could replace the complex if with
if (!isalpha(s[i]))
or you could even replace the whole thing with
s.erase(remove_if(s.begin(), s.end(), [](char ch){ return !isalpha(ch); }), s.end());
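For reference, here is a minimal self-contained sketch of that last suggestion. The function name keep_letters is made up for illustration, and the lambda takes unsigned char so that std::isalpha is never handed a negative value.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper name: returns a copy of s containing only its letters.
std::string keep_letters(std::string s)
{
    // remove_if moves the characters to keep to the front and returns the
    // new logical end; erase then drops the leftover tail.
    s.erase(std::remove_if(s.begin(), s.end(),
                           [](unsigned char ch) { return !std::isalpha(ch); }),
            s.end());
    return s;
}
```

Called as keep_letters("a1 b,c!"), this should give back "abc" in the default "C" locale.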

So you replaced the characters you don't want with '\0'.
Then you removed all ' ' characters.
That last stage presumably should involve '\0'…
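A small sketch of what that could look like, keeping the original loop's idea of overwriting with '\0' and then erasing '\0' instead of ' '. The function name strip_via_nul is an assumption made for this example.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper name: mark every non-letter with '\0', then erase the marks.
std::string strip_via_nul(std::string s)
{
    for (char& c : s) {
        if (!((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'))) {
            c = '\0';   // mark for removal
        }
    }
    // Erase the marker that was actually written, not ' '.
    s.erase(std::remove(s.begin(), s.end(), '\0'), s.end());
    return s;
}
```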

For the benefit of future readers: in C++20, we have unified erasure, so we can simply use
std::erase_if(s, [](unsigned char c) { return !std::isalpha(c); });
(See "Do I need to cast to unsigned char before calling toupper(), tolower(), et al.?" for why unsigned char should be used.)

Related

Remove out excess spaces from string in C++

I have written program for removing excess spaces from string.
#include <iostream>
#include <string>
void RemoveExcessSpaces(std::string &s) {
for (int i = 0; i < s.length(); i++) {
while (s[i] == ' ')s.erase(s.begin() + i);
while (s[i] != ' ' && i < s.length())i++;
}
if (s[s.length() - 1] == ' ')s.pop_back();
}
int main() {
std::string s(" this is string ");
RemoveExcessSpaces(s);
std::cout << "\"" << s << "\"";
return 0;
}
One thing is not clear to me. This while (s[i] == ' ')s.erase(s.begin() + i); should remove every space in string, so the output would be thisisstring, but I got correct output which is this is string.
Could you explain why the program didn't remove the one space between this and is, and why I got the correct output?
Note: I cannot use auxiliary strings.
That is because when your last while loop stops at the space between two words (this is), control passes to the increment part of your for loop. That increases i, so it now points to the next character of the string, the i of is. The space itself is never examined by the erasing loop, which is why it remains between this and is.
Your second while loop will break when s[i]==' '. But then your for loop will increment i and s[i] for that i will be skipped. This will happen for every first space character after each word.
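Since the original loop only works because the for increment happens to step over one space, here is a sketch of a different technique that states the intent directly and still uses no auxiliary string: std::unique with a predicate collapses each run of spaces in place. The function name collapse_spaces is made up for this example.

```cpp
#include <algorithm>
#include <string>

// Hypothetical helper name: squeeze runs of spaces and trim both ends, in place.
void collapse_spaces(std::string& s)
{
    // Two neighbouring characters count as "duplicates" only if both are
    // spaces, so every run of spaces shrinks to a single space.
    s.erase(std::unique(s.begin(), s.end(),
                        [](char a, char b) { return a == ' ' && b == ' '; }),
            s.end());
    // At most one space can be left at each end.
    if (!s.empty() && s.front() == ' ') s.erase(s.begin());
    if (!s.empty() && s.back() == ' ') s.pop_back();
}
```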

Replacing a substring with a space character

I am given a string and I have to remove a substring from it. Namely WUB, and replace it with a space character.
There are 2 WUB's between 'ARE' and 'THE'. So the first condition in the if statement is meant to avoid printing two blank spaces, but when I run the code two blank spaces are printed.
Input: WUBWEWUBAREWUBWUBTHEWUBCHAMPIONSWUBMYWUBFRIENDWUB
Output: WE ARE THE CHAMPIONS MY FRIEND
Here is my code so far:
#include <iostream>
using namespace std;
int main()
{
const string check = "WUB";
string s, p;
int ct = 0;
cin >> s;
for (int i = 0; i < s.size(); i++)
{
if (s[i] == 'W' && s[i+1] == 'U' && s[i+2] == 'B')
{
i += 2;
if (p[ct] == '32' || p.empty())
{
continue;
}
else
{
p += ' ';
ct++;
}
}
else
{
p += s[i];
ct++;
}
}
cout << p;
return 0;
}
Why is the first if statement never executed?
2 things are going to break your code:
your for loop runs with int i=0;i<s.size(), but its body reads (s[i]=='W' && s[i+1]=='U' && s[i+2]=='B'), so s[i+1] and s[i+2] go past the end of the string for the last values of i
and here: if(p[ct]=='32') you mean for sure if(p[ct]==32) or if(p[ct]==' ')
This condition
if(p[ct]=='32')
should read either
if(p[ct]==32)
or
if(p[ct]==' ')
that is, compare to the numeric value of the space character or to the space character itself.
Additionally, when your i grows close to the string's length, the subexpressions s[i+1] and s[i+2] may reach non-existing characters of the string. You should continue looping with an i<s.length()-2 condition.
EDIT
For a full solution you need to fully understand the problem you want to solve. The problem statement is a bit vague:
remove a substring ("WUB") from (a given string). And put a space inplace of it if required.
You considered the last condition, but not deeply enough. What does it mean, 'if required'? Replacement is not required if the resulting string is empty or you appended a space to it already (when you encounter a second or further consecutive WUB). It is also not necessary if you are at WUB, but there is nothing more following it, except possibly other WUBs...
So, when you find a "WUB" substring it is too early to decide if a space is needed. You know you need a space when you find a non-WUB text following some WUB (or WUBs) and there was some text before those WUB(s).
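One way to sketch that idea is to remember that a WUB was seen and only emit the space once real text follows. The function name unwub and the flag name pending are assumptions made for this example, not part of the original code.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper name: replace runs of "WUB" by single spaces,
// with no space at the start or the end of the result.
std::string unwub(const std::string& s)
{
    std::string out;
    bool pending = false;                 // a WUB was seen since the last text
    for (std::size_t i = 0; i < s.size(); ) {
        // compare() clamps the length, so it never reads past the end.
        if (s.compare(i, 3, "WUB") == 0) {
            pending = true;               // too early to decide about a space
            i += 3;
        } else {
            // Text after one or more WUBs: a space is needed only if
            // some text was already written before them.
            if (pending && !out.empty()) out += ' ';
            pending = false;
            out += s[i++];
        }
    }
    return out;
}
```

With the input from the question, this should produce WE ARE THE CHAMPIONS MY FRIEND without a leading or trailing space.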
There are actually three bugs here, so it's probably worth summarizing them in one answer:
The first condition:
if (s[i] == 'W' && s[i+1] == 'U' && s[i+2] == 'B')
is out of bounds for the last two characters. One fix would be to check the length first:
if(i + 2 < s.length() && s[i] == 'W' && s[i+1] == 'U' && s[i+2] == 'B')
There's a multicharacter-literal in
if (p[ct] == '32' || p.empty())
Use ' ' or 32 or std::isspace instead. IMO the last one is the best.
In the same condition
p[ct] == '32'
is always out of bounds: ct is equal to p.length(). (Credits to Some programmer dude, who mentioned this in the comments!) The variable ct is also redundant, since std::string knows its length. I suggest using std::string::back() to access the last character and reordering the condition like so:
if (p.empty() || std::isspace(p.back()))
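Putting the three fixes together, the loop from the question could look roughly like the sketch below. The function name replace_wub is made up for the example, and the bounds test is written as i + 2 < s.size() so it cannot wrap around for very short strings. Note that this version still leaves one trailing space when the input ends in WUB.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical helper name: the question's loop with the three fixes applied.
std::string replace_wub(const std::string& s)
{
    std::string p;
    for (std::size_t i = 0; i < s.size(); i++) {
        // Fix 1: check the bounds before reading s[i+1] and s[i+2].
        if (i + 2 < s.size() && s[i] == 'W' && s[i+1] == 'U' && s[i+2] == 'B') {
            i += 2;
            // Fixes 2 and 3: no multicharacter literal, no counter;
            // look at the last character actually stored in p.
            if (p.empty() || std::isspace(static_cast<unsigned char>(p.back()))) {
                continue;
            }
            p += ' ';
        } else {
            p += s[i];
        }
    }
    return p;
}
```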
The algorithm in this program is on the right track.
However, there are a few issues.
The for loop indexes past the end of the string: s[i+1] and s[i+2] do not exist for the last two values of i. One way to solve this is to make sure three characters remain before comparing, for example by testing i + 3 <= s.size() first, or by comparing with substr(), which never reads past the end.
I do not suggest using separate counter variables like ct. In this case p[ct] never refers to the last character appended, because ct always equals the length of p.
Creating a string and using the append() function is a better solution. We iterate through the string; if we find "WUB" at the current position we append a " ", otherwise we append the current character.
I highly recommend writing the first if() statement using std::string::substr().
This makes the code easier to read.
substr creates and returns a new string that starts at a given position and contains at most a given number of characters. Here is the syntax:
syntax: substr(startingIndex, count);
count is a length, not an ending index
#include <string>
#include <iostream>
int main() {
std::string s, p;
std::cin >> s;
for (std::size_t i = 0; i < s.size(); ) {
// substr(i, 3) returns up to 3 characters starting at i
if (s.substr(i, 3) == "WUB") {
p.append(" ");
i += 3; // skip the whole "WUB"
} else {
p.append(s.substr(i, 1));
i++;
}
}
std::cout << p;
}

The way to strip newline characters

Let's say we have a long string with multiple newline characters:
char const* some_text = "part1\n\npart2\npart3";
Now the task is to replace all '\n' characters with spaces if it appears only once between text parts, and at the same time leave all '\n' characters if it appears more than once. In other words:
"123\n456" => "123 456"
"123\n\n456" => "123\n\n456"
"123\n\n456\n789" => "123\n\n456 789"
What is the best way to do this?
The following regular expression detects single occurrences of newlines:
([^\n]|^)\n([^\n]|$)
|-------|
  no newline before
  (either other character or beginning of string)
        |--|
         newline
          |--------|
           no newline after
           (either other character or end of string)
You can use that regular expression in std::regex_replace in order to replace those single newlines by spaces (and keeping the matched character before and after the newline by adding $1 and $2):
std::string testString("\n123\n\n456\n789");
std::regex e("([^\n]|^)\n([^\n]|$)");
std::cout << std::regex_replace(testString, e, "$1 $2") << std::endl;
Since it was tagged as C++, I'll treat it as such. Obviously this could be solved with a regex but it's equally trivial enough (as described) not to require one.
std::string s = "your\n\nstring\nhere\n";
size_t n = -1, len = s.length();
while ((n = s.find('\n', n+1)) != std::string::npos)
if ((n == 0 && s[n+1] != '\n') || (n == len - 1 && s[n-1] != '\n') ||
(n != 0 && n != len - 1 && s[n-1] != '\n' && s[n+1] != '\n'))
s[n] = ' ';
This function may work for your case: it manually checks for a single \n and replaces it with a space. There may be a better option, like regex_replace.
void rep(char ch[])
{
int cnt = 0;
int i;
for(i=0; ch[i]!='\0'; i++)
{
if(ch[i]=='\n')
cnt++;
else if(cnt==1)
{
ch[i-1]=' ';
cnt=0;
}
else
cnt=0;
}
if(cnt==1)
ch[i-1]=' ';
}
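The same counting idea can also be sketched on a std::string by measuring each run of newlines and replacing only runs of length one. The function name join_single_newlines is an assumption for this example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper name: a lone '\n' becomes ' ', longer runs are kept.
std::string join_single_newlines(std::string s)
{
    std::size_t i = 0;
    while (i < s.size()) {
        if (s[i] != '\n') { ++i; continue; }
        std::size_t j = i;
        while (j < s.size() && s[j] == '\n') ++j;   // [i, j) is a run of newlines
        if (j - i == 1) s[i] = ' ';                 // only single newlines change
        i = j;
    }
    return s;
}
```

Run on the three examples from the question, this should give "123 456", "123\n\n456" and "123\n\n456 789".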

Why does assertion failed if both values are the same?

string removeNonAlphas(string original)
{
for(int i = 0; i < original.size(); ++i){
if(!(original[i] > 64 && original[i] < 91) &&
!(original[i] > 96 && original[i] < 124)){
original[i] = original[i] - original[i];
}
}
return original;
}
//test1.cpp
string test = "abc abc";
cout << removeNonAlphas(test) << endl; // output = "abcabc"
assert(removeNonAlphas(test) == "abcabc"); // assertion failed
//Why does assertion fail above? removeNonAlphas result("abcabc") is same as
//rhs "abcabc"
original[i] = original[i] - original[i];
What this does is replace the character with '\0'; it does not remove it. Because of that the output is not "abcabc" but "abc\0abc". '\0' is non-printable so you won't see it in the output, but it is present when you compare with ==.
Instead of replacing characters in a string, create a new string while iterating over the old one:
string removeNonAlphas(string const& original)
{
std::string result;
for(char c : original)
if((c > 64 && c < 91) ||
(c > 96 && c < 123)) // 'z' is 122; 123 would let '{' through
result.push_back(c);
return result;
}
Note: prefer using std::isalpha instead of hard-coded values.
Both values are NOT the same, but the difference is a non-printing character, so you can't see any difference with cout and your naked eye.
Try a proper tool, like a debugger, and you will see the extra \0 character present in the function result.
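If no debugger is at hand, checking size() is a quick way to see the hidden character. The sketch below uses a made-up helper, zero_out_space, that does to spaces what the question's function does to non-letters.

```cpp
#include <string>

// Hypothetical helper mirroring the question: overwrite instead of erase.
std::string zero_out_space(std::string s)
{
    for (char& c : s) {
        if (c == ' ') c = '\0';   // the character is replaced, not removed
    }
    return s;
}
```

For "abc abc" the result still has 7 characters, with '\0' at index 3, so it cannot compare equal to the 6-character "abcabc".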
You're not actually erasing any characters from your string. You're just assigning them the value 0. It just looks like it works - which is just the worst. The '\0' is just a non-printable character, which is why it looks like it prints the same. The == will actually check every character, even the non-printable ones, so it'll catch what you can't see.
Thankfully, the string class makes it easy to erase characters by providing just such a member function:
original.erase(i, 1); // erase a single character starting at i
Now that alone isn't enough. You erase a character, and now i is "pointing" to the next element - but you won't check it. If we had "abc12abc", after erasing the 1, we'd skip the 2. So we need to change how we iterate:
for (std::string::iterator it = original.begin();
it != original.end();
/* nothing */)
{
// here's a better way to do checking
if (!(*it >= 'A' && *it <= 'Z') &&
!(*it >= 'a' && *it <= 'z'))
{
// erase(iterator ) will return the next iterator
it = original.erase(it);
}
else
{
++it;
}
}
That'll work. It's also very verbose. And error-prone. Which is why we have the erase-remove idiom:
original.erase(
std::remove_if(original.begin(),
original.end(),
[](unsigned char c) { return !std::isalpha(c); }),
original.end()
);
return original;

Why is my program not reading all of characters in the string?

I need to input a string. If the input is a single word without spaces, the code works fine, but if the input contains spaces, only the first word is copied instead of the whole string. I'm a noob, please help.
#include <stdio.h>
#include <string.h>
int main() {
char again = 0;
do {
char str[60], s[60];
int i, j = 0;
printf("Enter any string->");
scanf("%s", str);
printf("The string is->%s", str);
for (i = 0; i <= strlen(str); i++) {
if (str[i] == 'a' || str[i] == 'e' || str[i] == 'i' ||
str[i] == 'o' || str[i] == 'u' || str[i] == 'A' ||
str[i] == 'E' || str[i] == 'I' || str[i] == 'O' ||
str[i] == 'U') {
str[i] = ' ';
} else {
s[j++] = str[i];
}
}
s[j] = '\0';
printf("\nThe string without vowel is->%s", s);
NSLog(@"Do you want to enter another string to be edit? (y/n) ");
scanf("%s", &again);
} while (again != 'n');
}
Your code stops reading at a space because that's how scanf works with the %s format. It reads a sequence of non-whitespace characters.
If you're really using C++, then you'd be wise to switch to std::string and std::getline, which will read all input up to the end of the line. Your code doesn't appear to use any C++ features, though, so maybe you're really using C. In that case, consider fgets instead. It will read the whole line, too (up to a specified size, which generally corresponds to the size of your buffer).
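To illustrate the difference in C++ terms, here is a small sketch contrasting word-wise extraction (which, like %s, stops at whitespace) with std::getline. Both function names are made up for the example, and they take a std::istream so that std::cin or a std::istringstream can be passed in.

```cpp
#include <istream>
#include <sstream>   // std::istringstream is handy for trying this out
#include <string>

// Hypothetical helper: like scanf("%s"), stops at the first whitespace.
std::string first_word(std::istream& in)
{
    std::string word;
    in >> word;
    return word;
}

// Hypothetical helper: reads everything up to (not including) the newline.
std::string whole_line(std::istream& in)
{
    std::string line;
    std::getline(in, line);
    return line;
}
```

Given the input line "remove the vowels", first_word should return just "remove", while whole_line should return the full text including the spaces.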
This code is a mess.
C++ features like std::string are not used at all.
You're mixing printf/scanf and NSLog for no reason.
Modifying str in the if branch makes no sense, as it won't be read later.
You probably want to use i < strlen(str) instead of <=, or you'll copy that terminating zero character twice.
Your scanf("%s", &again); specifies to read a string, but you only have memory for a character, thus you probably end up writing into some random memory position.
While some of these points are more severe than others, I suggest fixing those issues and see what happens. If you experience unexpected output, then please do give your example input and output as well.
You are allocating an array 60 chars long (str). You can't expect to read a lot into it. Here are a few tips:
Don't use such buffers, they are dangerous. The C++ library provides you std::string.
Never omit the curly braces {}.
There is an easier way to check for chars than this horribly long if. Hint: std::string::find_first_of
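Following that hint, a vowel-stripping routine could be sketched as below. The function name strip_vowels is an assumption for this example; find_first_of returns the position of the first character that occurs in the given set, or std::string::npos if there is none.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper name: returns a copy of s with all vowels removed.
std::string strip_vowels(std::string s)
{
    const std::string vowels = "aeiouAEIOU";
    std::size_t pos = 0;
    // After erasing at pos, the next character slides into pos,
    // so the search simply resumes from the same position.
    while ((pos = s.find_first_of(vowels, pos)) != std::string::npos) {
        s.erase(pos, 1);
    }
    return s;
}
```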