Can scanf/sscanf deal with escaped characters? - c++

#include <cstdio>

int main()
{
char* a = " 'Fools\' day' ";
char b[64];   // a character buffer, not an array of pointers
sscanf(a, " '%[^']s ", b);
printf ("%s", b);
}
--> puts "Fools" in b
Obviously, I want to have "Fools' day" in b. Can I tell sscanf() not to consider escaped apostrophes as the end of the character sequence?
Thanks!

No. Those functions just read plain old characters. They don't interpret the contents according to any escaping rules because there's nothing to escape from — quotation marks, apostrophes, and backslashes aren't special in the input string.
You'll have to use something else to parse your string. You can write a little state machine to read the string one character at a time, keeping track of whether the previous character was a backslash. (Don't just scan to the next apostrophe and then look one character backward; if you're allowed to escape backslashes as well as apostrophes, then you could end up re-scanning all the way back to the start of the string to see whether you have an odd or even number of escape characters. Always parse strings forward, not backward.)
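The state machine described above could look roughly like the following. This is only a sketch: the function name `read_quoted` and the three state names are made up for illustration, and it assumes that a backslash escapes whatever single character follows it. Note that the input must contain a real backslash, which in C or C++ source has to be written as `\\`.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <string>

// Hypothetical helper: extracts the first apostrophe-delimited token from
// `input`. A backslash inside the token escapes the next character.
// Returns false if no closing apostrophe is found.
bool read_quoted(const std::string& input, std::string& out) {
    enum class State { BeforeQuote, InQuote, AfterBackslash };
    State state = State::BeforeQuote;
    out.clear();
    for (char c : input) {
        switch (state) {
        case State::BeforeQuote:
            if (c == '\'') state = State::InQuote;   // opening apostrophe
            break;
        case State::InQuote:
            if (c == '\\') state = State::AfterBackslash;
            else if (c == '\'') return true;         // unescaped closing apostrophe
            else out.push_back(c);
            break;
        case State::AfterBackslash:
            out.push_back(c);                        // escaped character is taken literally
            state = State::InQuote;
            break;
        }
    }
    return false;   // ran out of input before the token was closed
}
```

With `std::string out;`, the call `read_quoted(" 'Fools\\' day' ", out)` should return true and leave `Fools' day` in `out`. Because the scan only ever moves forward, escaped backslashes need no special handling.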

Replace
char* a = " 'Fools\' day' ";
with
char* a = " 'Fools' day' ";
The ' character isn't special inside a C string (although it is special within a single char literal), so there is no need to escape it.
Also, if all you want is "Fools' day", why put the extra 's at the start and end? Maybe you are confusing C strings with those in some other language?
Edit:
As Rob Kennedy's comment says, I was assuming you are supplying the string yourself. Otherwise, see Rob's answer.

Why on earth would you write such a thing, instead of using std::string? Since your question is tagged C++.
int main(int argc, char* argv[])
{
std::string a = " 'Fools' day' ";
std::string b(a.begin() + 2, std::find(a.begin() + 2, a.end(), ' '));
std::cout << b;
std::cin.get();
}
Edit: Oh wait a second, you want to read a string within a string? Just use escaped double quotes, e.g.
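int main(int argc, char* argv[]) {
    std::string a = " \"Fool's day\" ";
    // Start just after the opening quote (this literal is known to contain one).
    auto it = std::find(a.begin(), a.end(), '"') + 1;
    // Copy everything up to, but not including, the closing quote.
    std::string b(it, std::find(it, a.end(), '"'));
    std::cout << b;
}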
If the user put the string in, they won't have to escape single quotes, although they would have to escape double quotes, and you'd have to make your own system for that.

Related

Using one cout command to print multiple strings with each string placed on a different (text editor) line

Take a look at the following example:
cout << "option 1:
\n option 2:
\n option 3";
I know it's not the best way to output a string, but the question is: why does this cause an error saying that a " character is missing? There is a single string that must go to stdout; it just contains a lot of whitespace characters.
What about this:
string x="
string_test";
One may interpret that string as: "\nxxxxxxxxxxxxstring_test" where x is a whitespace character.
Is it a convention?
That's called a multiline string literal.
You need to escape the embedded newline. Otherwise, it will not compile:
std::cout << "Hello world \
and stackoverflow";
Note: the backslash must be the very last character on the line, because it has to escape the newline in the source. The escaped newline is removed from the source, so it does not appear in the resulting string.
You can also take advantage of the fact that adjacent string literals are concatenated by the compiler:
std::cout << "Hello World"
"Stack overflow";
C++11 also added raw string literals. They are similar to here-documents in other languages.
Syntax:
prefix(optional) R"delimiter( raw_characters )delimiter"
It allows any character sequence, except that it must not contain the
closing sequence )delimiter". It is used to avoid escaping of any
character. Anything between the delimiters becomes part of the string.
const char* s1 = R"foo(
Hello
World
)foo";
Example taken from cppreference.

Matching of strings with special characters

I need to generate a string that can match another both containing special characters. I wrote what I thought would be a simple method, but so far nothing has given me a successful match.
I know that special characters in C++ are preceded by a "\". For example, a single quote would be written as "\'".
string json_string(const string& incoming_str)
{
string str = "\\\"" + incoming_str + "\\\"";
return str;
}
And this is the string I have to compare to:
bool comp = json_string("hello world") == "\"hello world\"";
I can see in the cout stream that in fact I'm generating the string as needed but the comparison still gives a false value.
What am I missing? Any help would be appreciated.
One way is to filter one string and compare this filtered string. For example:
#include <algorithm>
#include <iostream>
#include <iterator>
#include <string>

// Returns a copy of `unfiltered` without any character listed in `specialChars`.
std::string filterBy(const std::string& unfiltered, const std::string& specialChars)
{
    std::string filtered;
    std::copy_if(unfiltered.begin(), unfiltered.end(),
                 std::back_inserter(filtered),
                 [&specialChars](char c){ return specialChars.find(c) == std::string::npos; });
    return filtered;
}

int main() {
    std::string specialChars = "\"";
    std::string string1 = "test";
    std::string string2 = "\"test\"";
    std::cout << (string1 == filterBy(string2, specialChars) ? "match" : "no match");
    return 0;
}
Output is match. This code also works if you add an arbitrary number of characters to specialChars.
If both strings contain special characters, you can also put string1 through the filterBy function. Then, something like:
"\"hello \"world \"" == "\"hello world "
will also match, because both sides reduce to the same text once the quotes are filtered out.
If the comparison is performance-critical, you could instead write a comparison that walks both strings with two iterators and skips the special characters on the fly. That avoids building the filtered copies while keeping the comparison linear, O(N+M), where N and M are the sizes of the two strings.
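A two-iterator comparison of that kind might be sketched as follows. The name `equal_ignoring` is hypothetical, and the sketch assumes that "special" simply means "appears in the `special` string", as in the filterBy example.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <string>

// Hypothetical helper: compares a and b as if every character contained in
// `special` had been removed from both, without allocating filtered copies.
bool equal_ignoring(const std::string& a, const std::string& b,
                    const std::string& special) {
    // Advance an iterator past any run of special characters.
    auto skip = [&special](std::string::const_iterator it,
                           std::string::const_iterator end) {
        while (it != end && special.find(*it) != std::string::npos) ++it;
        return it;
    };

    auto ia = skip(a.begin(), a.end());
    auto ib = skip(b.begin(), b.end());
    while (ia != a.end() && ib != b.end()) {
        if (*ia != *ib) return false;      // first ordinary characters differ
        ia = skip(ia + 1, a.end());
        ib = skip(ib + 1, b.end());
    }
    // Equal only if both strings ran out of ordinary characters together.
    return ia == a.end() && ib == b.end();
}
```

For example, `equal_ignoring("test", "\"test\"", "\"")` should return true. Each character of either string is visited once, so the cost grows linearly with the combined length (for a fixed, small set of special characters).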
bool comp = json_string("hello world") == "\"hello world\"";
This will definitely yield false. json_string("hello world") creates the string \"hello world\" (with literal backslashes), but you are comparing it to "hello world" (with plain quotes).
The problem is here:
string str = "\\\"" + incoming_str + "\\\"";
In the first string literal of str, the leading backslash that you assume will act as an escape character is not treated as one: it is itself escaped, so it ends up as a literal backslash in the string. The same happens in the last string literal.
Do this:
string str = "\"" + incoming_str + "\"";
In C++ string literals are delimited by quotes.
Then the problem arises: How can I define a string literal that does itself contain quotes? In Python (for comparison), this can get easy (but there are other drawbacks with this approach not of interest here): 'a string with " (quote)'.
C++ doesn't have this alternative string representation [1]; instead, you are limited to escape sequences (which are available in Python, too, just for completeness): within a string (or character) literal (but nowhere else!), the sequence \" is replaced by one double-quote character in the resulting string.
So "\"hello world\"" defined as character array would be:
{ '"', 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '"', 0 };
Note that now the escape character is not necessary...
Within your json_string function, you append additional backslashes, though:
"\\\""
{ '\', '"', 0 }
//^^^
Note that I wrote '\' just for illustration! How would you define a single quote? By escaping again: '\''. But now you need to escape the escape character, too, so a single backslash actually needs to be written as '\\' here (whereas, in comparison, you don't have to escape the single quote in a string literal: "i am 'singly quoted'", just as you didn't have to escape the double quote in the character literal).
As JSON uses double quotes for strings, too, you'd most likely want to change your function:
return "\"" + incoming_str + "\"";
or even much simpler:
return '"' + incoming_str + '"';
Now
json_string("hello world") == "\"hello world\""
would yield true...
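To make the difference concrete, here is a small sketch that puts the corrected function next to the original one. The name `json_string_with_backslashes` is made up here to hold the question's version. Neither function escapes quotes inside `incoming_str`, which real JSON output would need.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <string>

// Corrected version: wraps the text in plain double quotes.
std::string json_string(const std::string& incoming_str) {
    return '"' + incoming_str + '"';
}

// The question's version, kept for comparison: "\\\"" is a backslash
// followed by a double quote, so two characters are added on each side.
std::string json_string_with_backslashes(const std::string& incoming_str) {
    return "\\\"" + incoming_str + "\\\"";
}
```

With these definitions, `assert(json_string("hello world") == "\"hello world\"");` should hold, while the backslash version is two characters longer and compares unequal.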
[1] Side note (taken from an answer that has since been deleted): since C++11, there are raw string literals, too. Using these, you don't have to escape quotes either.

Remove spaces from string before period and comma

I could have a string like:
During this time , Bond meets a stunning IRS agent , whom he seduces .
I need to remove the extra spaces before the comma and before the period in my whole string. I tried throwing this into a char vector and only not push_back if the current char was " " and the following char was a "." or "," but it did not work. I know there is a simple way to do it maybe using trim(), find(), or erase() or some kind of regex but I am not the most familiar with regex.
A solution could be (using regex library):
std::string fix_string(const std::string& str) {
static const std::regex rgx_pattern("\\s+(?=[\\.,])");
std::string rtn;
rtn.reserve(str.size());
std::regex_replace(std::back_insert_iterator<std::string>(rtn),
str.cbegin(),
str.cend(),
rgx_pattern,
"");
return rtn;
}
This function takes a string as input and returns a copy with the whitespace before each comma or period removed.
In a loop, search for the string " ," and, whenever you find one, replace it with ",":
std::string str = "...";
while( true ) {
auto pos = str.find( " ," );
if( pos == std::string::npos )
break;
str.replace( pos, 2, "," );
}
Do the same for " .". If you need to process different space symbols like tab use regex and proper group.
I don't know how to use regex in C++ and am not sure whether C++ supports PCRE regex, but I am posting this answer for the regex itself (I can delete it if it doesn't work in C++).
You can use this regex:
\s+(?=[,.])
First, there is no need to use a vector of char: you could very well do the same by using an std::string.
Then, your approach can't work because your copy is independent of the position of the space. Unfortunately you have to remove only spaces around the punctuation, and not those between words.
By modifying your code slightly, you could delay copying spaces until you see the first non-space character: if it is not punctuation, you copy a space before the character; otherwise you copy only the non-space character (thus getting rid of the spaces).
Similarly, once you've copied a punctuation just loop and ignore the following spaces until the first non-space char.
I could have written the code, and it would have been shorter, but I prefer to let you finish your homework with a full understanding of the approach.
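For readers who want something to compare their own attempt against, one possible sketch of the delayed-copy idea follows. The function name is hypothetical, and the sketch makes two assumptions: only the plain space character counts as a space, and spaces after punctuation are kept, since the question only asks to remove the ones before a comma or period.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <cstddef>
#include <string>

// Hypothetical helper: drops runs of spaces that come directly before
// a ',' or '.', and copies everything else unchanged.
std::string remove_space_before_punct(const std::string& in) {
    std::string out;
    out.reserve(in.size());
    std::size_t pending = 0;               // spaces seen but not yet copied
    for (char c : in) {
        if (c == ' ') {
            ++pending;                     // delay the decision
            continue;
        }
        if (c != ',' && c != '.') {
            out.append(pending, ' ');      // ordinary character: keep the spaces
        }
        pending = 0;                       // before punctuation they are dropped
        out.push_back(c);
    }
    out.append(pending, ' ');              // trailing spaces are kept as they were
    return out;
}
```

Applied to the sentence from the question, this should give `During this time, Bond meets a stunning IRS agent, whom he seduces.` in a single pass over the input.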

How to save " in a string in C++?

So I have the following code which doesn't work. I couldn't figure it out how to do it.
std::string str("Q850?51'18.23"");
First problem I face is " (quotation mark). I cannot save it as a string because at the end of the string I have two " characters and C++ doesn't let me save the whole string.
Second I want to split the string and save it in different variables.
E.g.;
double i = 850;
double j = 51;
double k = 18.23;
You will need to escape the quotation mark you require in the string;
std::string str("Q850?51'18.23\"");
// ^ escape the quote here
The cppreference site has a list of these escape sequences.
Alternatively, you can use a raw string literal;
std::string str = R"(Q850?51'18.23")";
The second part of the problem depends on the format and predictability of the data;
If it is fixed width, a simple index can be used to extract the numbers and convert them to the doubles you require.
If it is delimited by the characters above, you can consume the string up to each of the delimiters, extracting the numbers in between them (you should be able to find suitable libraries to assist with this).
If it is some other, unknown composition, you may be limited to consuming the string one character at a time and extracting the numerical values between the non-numerical characters.
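For the delimited case, a stream-based sketch might look like this. It assumes the layout is always one prefix character, a number, one separator character, a number, one separator character, and a final number; the function name `parse_fields` and that fixed layout are assumptions made for illustration, not something the question confirms.

```cpp
#include <cassert>   // only needed for the checks in the usage note
#include <cmath>
#include <sstream>
#include <string>

// Hypothetical helper: reads three numbers separated by single
// non-numeric characters, e.g. Q850?51'18.23"
// Returns false if the text does not follow that layout.
bool parse_fields(const std::string& text, double& i, double& j, double& k) {
    std::istringstream in(text);
    char prefix = 0, sep1 = 0, sep2 = 0;
    // Each extraction stops at the first character that cannot belong to
    // the value being read, so the separators are consumed one by one.
    in >> prefix >> i >> sep1 >> j >> sep2 >> k;
    return !in.fail();
}
```

Called as `parse_fields("Q850?51'18.23\"", i, j, k)`, this should leave 850, 51 and 18.23 in the three variables. The separators are read but not checked; a stricter version would compare them against the expected characters.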
You need to escape your quote mark:
std::string str("Q850?51'18.23\"");
// ^
You need to escape your quote mark
Add a backslash before "
std::string str("Q850?51'18.23\"");

How to replace/remove a character in a character buffer?

I am trying to modify someone's code which uses this line:
out.write(&vecBuffer[0], x.length());
However, I want to modify the buffer beforehand so it removes any bad characters I don't want to be output. For example if the buffer is "Test%string" and I want to get rid of %, I want to change the buffer to "Test string" or "Teststring" whichever is easier.
std::replace will allow replacing one specific character with
another, e.g. '%' with ' '. Just call it normally:
std::replace( vecBuffer.begin(), vecBuffer.end(), '%', ' ' );
Replace the '%' with a predicate object, call replace_if,
and you can replace any character for which the predicate
object returns true. But always with the same character. For
more flexibility, there's std::transform, which you pass
a function which takes a char, and returns a char; it will
be called on each character in the buffer.
Alternatively, you can do something like:
vecBuffer.erase(
std::remove( vecBuffer.begin(), vecBuffer.end(), '%' ),
vecBuffer.end() );
To remove the characters. Here too, you can replace remove
with remove_if, and use a predicate, which may match many
different characters.
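The predicate variants mentioned above could be sketched like this, assuming the buffer is a `std::vector<char>`. The predicate `is_bad` and both function names are made up for illustration; which characters count as bad is an assumption.

```cpp
#include <algorithm>
#include <cassert>   // only needed for the checks in the usage note
#include <string>
#include <vector>

// Hypothetical predicate: the characters that should not be written out.
bool is_bad(char c) {
    return c == '%' || c == '#';
}

// Overwrites every bad character with a space; the size stays the same.
void blank_bad_chars(std::vector<char>& buf) {
    std::replace_if(buf.begin(), buf.end(), is_bad, ' ');
}

// Removes every bad character; the buffer shrinks accordingly.
void drop_bad_chars(std::vector<char>& buf) {
    buf.erase(std::remove_if(buf.begin(), buf.end(), is_bad), buf.end());
}
```

After `drop_bad_chars` the buffer is shorter, so a later `write` call should use the buffer's current `size()` rather than a length computed before the cleanup.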
The simplest library you can use is probably the Boost String Algorithms library.
boost::replace_all(buffer, "%", "");
will replace all occurrences of % by nothing, in place. You could specify " " as a replacement, or even "REPLACEMENT", as suits you.
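std::string str("Test string");
// replace_if cannot delete characters (and '' is not a valid character literal),
// so erase the matching characters instead.
str.erase(std::remove_if(str.begin(), str.end(), boost::is_any_of(" ")), str.end());
std::cout << str << '\n';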
You do not need to use the boost library. The easiest way is to replace the % character with a space, using std::replace() from the <algorithm> header:
std::replace(vecBuffer.begin(), vecBuffer.end(), '%', ' ');
I assume that vecBuffer, as its name implies, is an std::vector. If it's actually a plain array (or pointer), then you would do:
std::replace(vecBuffer, vecBuffer + SIZE_OF_BUFFER, '%', ' ');
SIZE_OF_BUFFER should be the size of the array (or the number of characters in the array you want to process, if you don't want to convert the whole buffer).
Assuming you have a function
bool goodChar( char c );
that returns true for characters you approve of and false otherwise,
then how about
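// Compacts buf in place, keeping only the good characters,
// and returns the new length so the caller knows how much to write.
unsigned int fixBuf( char* buf, unsigned int len ) {
    unsigned int co = 0;
    for ( unsigned int cb = 0 ; cb < len ; cb++ ) {
        if ( goodChar( buf[cb] ) ) {
            buf[co] = buf[cb];
            co++;
        }
    }
    return co;
}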