How to replace/remove a character in a character buffer? - c++

I am trying to modify someone's code which uses this line:
out.write(&vecBuffer[0], x.length());
However, I want to modify the buffer beforehand so it removes any bad characters I don't want to be output. For example if the buffer is "Test%string" and I want to get rid of %, I want to change the buffer to "Test string" or "Teststring" whichever is easier.

std::replace will allow replacing one specific character with
another, e.g. '%' with ' '. Just call it normally:
std::replace( vecBuffer.begin(), vecBuffer.end(), '%', ' ' );
Replace the '%' with a predicate object, call replace_if,
and you can replace any character for which the predicate
object returns true. But always with the same character. For
more flexibility, there's std::transform, which you pass
a function which takes a char, and returns a char; it will
be called on each character in the buffer.
Alternatively, you can do something like:
vecBuffer.erase(
std::remove( vecBuffer.begin(), vecBuffer.end(), '%' ).
vecBuffer.end() );
To remove the characters. Here too, you can replace remove
with remove_if, and use a predicate, which may match many
different characters.

The simplest library you can use is probably the Boost String Algorithms library.
boost::replace_all(buffer, "%", "");
will replace all occurrences of % by nothing, in place. You could specify " " as a replacement, or even "REPLACEMENT", as suits you.

std::string str("Test string");
std::replace_if(str.begin(), str.end(), boost::is_any_of(" "), '');
std::cout << str << '\n';

You do not need to use the boost library. The easiest way is to replace the % character with a space, using std::replace() from the <algorithm> header:
std::replace(vecBuffer.begin(), vecBuffer.end(), '%', ' ');
I assume that vecBuffer, as its name implies, is an std::vector. If it's actually a plain array (or pointer), then you would do:
std::replace(vecBuffer, vecBuffer + SIZE_OF_BUFFER, '%', ' ');
SIZE_OF_BUFFER should be the size of the array (or the amount of characters in the array you want to process, if you don't want to convert the whole buffer.)

Assuming you have a function
bool goodChar( char c );
That returns true for characters you are approved of and false otherwise,
then how about
void fixBuf( char* buf, unsigned int len ) {
unsigned int co = 0;
for ( unsigned int cb = 0 ; cb < len ; cb++ ) {
if goodChar( buf[cb] ) {
buf[co] = buf[cb];
co++;
}
}
}

Related

Allow user to pass a separator character by doubling it in C++

I have a C++ function that accepts strings in below format:
<WORD>: [VALUE]; <ANOTHER WORD>: [VALUE]; ...
This is the function:
std::wstring ExtractSubStringFromString(const std::wstring String, const std::wstring SubString) {
std::wstring S = std::wstring(String), SS = std::wstring(SubString), NS;
size_t ColonCount = NULL, SeparatorCount = NULL; WCHAR Separator = L';';
ColonCount = std::count(S.begin(), S.end(), L':');
SeparatorCount = std::count(S.begin(), S.end(), Separator);
if ((SS.find(Separator) != std::wstring::npos) || (SeparatorCount > ColonCount))
{
// SEPARATOR NEED TO BE ESCAPED, BUT DON'T KNOW TO DO THIS.
}
if (S.find(SS) != std::wstring::npos)
{
NS = S.substr(S.find(SS) + SS.length() + 1);
if (NS.find(Separator) != std::wstring::npos) { NS = NS.substr(NULL, NS.find(Separator)); }
if (NS[NS.length() - 1] == L']') { NS.pop_back(); }
return NS;
}
return L"";
}
Above function correctly outputs MANGO if I use it like:
ExtractSubStringFromString(L"[VALUE: MANGO; DATA: NOTHING]", L"VALUE")
However, if I have two escape separators in following string, I tried doubling like ;;, but I am still getting MANGO instead ;MANGO;:
ExtractSubStringFromString(L"[VALUE: ;;MANGO;;; DATA: NOTHING]", L"VALUE")
Here, value assigner is colon and separator is semicolon. I want to allow users to pass colons and semicolons to my function by doubling extra ones. Just like we escape double quotes, single quotes and many others in many scripting languages and programming languages, also in parameters in many commands of programs.
I thought hard but couldn't even think a way to do it. Can anyone please help me on this situation?
Thanks in advance.
You should search in the string for ;; and replace it with either a temporary filler char or string which can later be referenced and replaced with the value.
So basically:
1) Search through the string and replace all instances of ;; with \tempFill- It would be best to pick a combination of characters that would be highly unlikely to be in the original string.
2) Parse the string
3) Replace all instances of \tempFill with ;
Note: It would be wise to run an assert on your string to ensure that your \tempFill (or whatever you choose as the filler) is not in the original string to prevent an bug/fault/error. You could use a character such as a \n and make sure there are non in the original string.
Disclaimer:
I can almost guarantee there are cleaner and more efficient ways to do this but this is the simplest way to do it.
First as the substring does not need to be splitted I assume that it does not need to b pre-processed to filter escaped separators.
Then on the main string, the simplest way IMHO is to filter the escaped separators when you search them in the string. Pseudo code (assuming the enclosing [] have been removed):
last_index = begin_of_string
index_of_current_substring = begin_of_string
loop: search a separator starting at last index - if not found exit loop
ok: found one at ix
if char at ix+1 is a separator (meaning with have an escaped separator
remove character at ix from string by copying all characters after it one step to the left
last_index = ix+1
continue loop
else this is a true separator
search a column in [ index_of_current_substring, ix [
if not found: error incorrect string
say found at c
compare key_string with string[index_of_current_substring, c [
if equal - ok we found the key
value is string[ c+2 (skip a space after the colum), ix [
return value - search is finished
else - it is not our key, just continue searching
index_of_current_substring = ix+1
last_index = index_of_current_substring
continue loop
It should now be easy to convert that to C++

Remove spaces from string before period and comma

I could have a string like:
During this time , Bond meets a stunning IRS agent , whom he seduces .
I need to remove the extra spaces before the comma and before the period in my whole string. I tried throwing this into a char vector and only not push_back if the current char was " " and the following char was a "." or "," but it did not work. I know there is a simple way to do it maybe using trim(), find(), or erase() or some kind of regex but I am not the most familiar with regex.
A solution could be (using regex library):
std::string fix_string(const std::string& str) {
static const std::regex rgx_pattern("\\s+(?=[\\.,])");
std::string rtn;
rtn.reserve(str.size());
std::regex_replace(std::back_insert_iterator<std::string>(rtn),
str.cbegin(),
str.cend(),
rgx_pattern,
"");
return rtn;
}
This function takes in input a string and "fixes the spaces problem".
Here a demo
On a loop search for string " ," and if you find one replace that to ",":
std::string str = "...";
while( true ) {
auto pos = str.find( " ," );
if( pos == std::string::npos )
break;
str.replace( pos, 2, "," );
}
Do the same for " .". If you need to process different space symbols like tab use regex and proper group.
I don't know how to use regex for C++, also not sure if C++ supports PCRE regex, anyway I post this answer for the regex (I could delete it if it doesn't work for C++).
You can use this regex:
\s+(?=[,.])
Regex demo
First, there is no need to use a vector of char: you could very well do the same by using an std::string.
Then, your approach can't work because your copy is independent of the position of the space. Unfortunately you have to remove only spaces around the punctuation, and not those between words.
Modifying your code slightly you could delay copy of spaces waiting to the value of the first non-space: if it's not a punctuation you'd copy a space before the character, otherwise you just copy the non-space char (thus getting rid of spaces.
Similarly, once you've copied a punctuation just loop and ignore the following spaces until the first non-space char.
I could have written code. It would have been shorter. But i prefer letting you finish your homework with full understanding of the approach.

Remove characters from std::string from "(" to ")" with erase ?

I want to remove the substring of my string , it looks something like this :
At(Robot,Room3)
or
SwitchOn(Room2)
or
SwitchOff(Room1)
How can I remove all the characters from the left bracket ( to the right bracket ) , when I don't know their indexes ?
If you know the string matches the pattern then you can do:
std::string str = "At(Robot,Room3)";
str.erase( str.begin() + str.find_first_of("("),
str.begin() + str.find_last_of(")"));
or if you want to be safer
auto begin = str.find_first_of("(");
auto end = str.find_last_of(")");
if (std::string::npos!=begin && std::string::npos!=end && begin <= end)
str.erase(begin, end-begin);
else
report error...
You can also use the standard library <regex>.
std::string str = "At(Robot,Room3)";
str = std::regex_replace(str, std::regex("([^(]*)\\([^)]*\\)(.*)"), "$1$2");
If your compiler and standard library is new enough, then you could use std::regex_replace.
Otherwise, you search for the first '(', do a reverse search for the last ')', and use std::string::erase to remove everything in between. Or if there can be nothing after the closing parenthesis then find the first and use std::string::substr to extract the string you want to keep.
If the trouble you have is actually finding the parentheses the use std::string::find and/or std::string::rfind.
You have to search for the first '(' then erase after until 'str.length() - 1' (assuming your second bracket is always at the end)
A simple and safe and efficient solution:
std::string str = "At(Robot,Room3)";
size_t const open = str.find('(');
assert(open != std::string::npos && "Could not find opening parenthesis");
size_t const close = std.find(')', open);
assert(open != std::string::npos && "Could not find closing parenthesis");
str.erase(str.begin() + open, str.begin() + close);
Never parse a character more than once, beware of ill-formed inputs.

C++ string replace doesn't work as expected

small question about C++ replace function. I'm parsing every line of text input line by line. Example of the text file:
SF27_34KJ
EEE_30888
KPD324222
4230_333
And I need to remove all the underscores on every line and replace it with a comma. When I try something like this:
mystring.replace(mystring.begin(), mystring.end(), '_', ',');
on every line - instead of "SF27,34KJ" I get 95x "," char. What could be wrong?
Use std::replace():
std::replace(mystring.begin(), mystring.end(), '_', ',');
basic_string::replace doesn't do what you think it does.
basic_string::replace(it_a, it_e, ... ) replaces all of the characters between it_a and it_e with whatever you specify, not just those that match something.
There are a hundred ways to do what you're trying to do, but the simplest is probably to use the std::replace from <algorithm>, which does do what you want:
std::replace(mystring.begin(), mystring.end(), '_', ',');
Another method is to use std::transform in conjunction with a functor. This has an advantage over std::replace in that you can perform multiple substitutions in a single pass.
Here is a C++03 functor that would do it:
struct ReplChars
{
char operator()(char c) const
{
if( c == '_' )
return ',';
if( c == '*' )
return '.';
return c;
}
};
...and the use of it:
std::transform(mystring.begin(), mystring.end(), mystring.begin(), ReplChars());
In C++11, this can be reduced by using a lambda instead of the functor:
std::transform(mystring.begin(), mystring.end(), mystring.begin(), [](char c)->char
{
if( c == '_' )
return ',';
if( c == '*' )
return '.';
return c;
});
Looking here, there is no replace method which takes two iterators and then two characters. Considering the ascii value of '_' is 95, I'm guessing you're hitting this one instead:
string& replace ( iterator i1, iterator i2, size_t n2, char c );
So instead of replacing all instances of '_' with ',', instead you're replacing the string from begin() to end() with 95 ','s.
See here for how to replace occurrances in a string.
This isn't exactly how replace works. Check out the api http://www.cplusplus.com/reference/string/string/replace/, you give iterators for the beginning and end along with a string to copy in, but the other argument is the maximum length of that section.
To get the functionality that you're going for, try finding a substring and replacing it (two calls). That's detailed here: Replace part of a string with another string .

Can scanf/sscanf deal with escaped characters?

int main()
{
char* a = " 'Fools\' day' ";
char* b[64];
sscanf(a, " '%[^']s ", b);
printf ("%s", b);
}
--> puts "Fools" in b
Obviously, I want to have "Fools' day" in b. Can I tell sscanf() not to consider escaped apostrophes as the end of the character sequence?
Thanks!
No. Those functions just read plain old characters. They don't interpret the contents according to any escaping rules because there's nothing to escape from — quotation marks, apostrophes, and backslashes aren't special in the input string.
You'll have to use something else to parse your string. You can write a little state machine to read the string one character at a time, keeping track of whether the previous character was a backslash. (Don't just scan to the next apostrophe and then look one character backward; if you're allowed to escape backslashes as well as apostrophes, then you could end up re-scanning all the way back to the start of the string to see whether you have an odd or even number of escape characters. Always parse strings forward, not backward.)
Replace
char* a = " 'Fools\' day' ";
with
char* a = " 'Fools' day' ";
The ' character isn't special inside a C string (although it is special within a single char). So there is not need to escape it.
Also, if all you want is "Fools' day", why put the extra 's at the start and end? Maybe you are confusing C strings with those in some other language?
Edit:
As Rob Kennedy's comment says, I was assuming you are supplying the string yourself. Otherwise, see Rob's answer.
Why on earth would you write such a thing, instead of using std::string? Since your question is tagged C++.
int main(int argc, char* argv[])
{
std::string a = " 'Fools' day' ";
std::string b(a.begin() + 2, std::find(a.begin() + 2, a.end(), ' '));
std::cout << b;
std::cin.get();
}
Edit: Oh wait a second, you want to read a string within a string? Just use escaped double quotes, e.g.
int main(int argc, char* argv[]) {
std::string a = " \"Fool's day\" ";
auto it = std::find(a.begin(), a.end(), '"');
std::string b(it, std::find(it, a.end(), '"');
std::cout << b;
}
If the user put the string in, they won't have to escape single quotes, although they would have to escape double quotes, and you'd have to make your own system for that.