How to put quotation marks ("") into quotation marks? [duplicate] - c++

I have the following output created using a printf() statement:
printf("She said time flies like an arrow, but fruit flies like a banana.");
but I want to put the actual quotation in double-quotes, so the output is
She said "time flies like an arrow, but fruit flies like a banana".
without interfering with the double quotes used to wrap the string literal in the printf() statement.
How can I do this?

Escape the quotes with backslashes:
printf("She said \"time flies like an arrow, but fruit flies like a banana\".");
String literals support escape sequences, which are introduced by a leading backslash; \" produces a literal double quote.

Thankfully, with C++11 there is also the more pleasing approach of using raw string literals.
printf("She said \"time flies like an arrow, but fruit flies like a banana\".");
Becomes:
printf(R"(She said "time flies like an arrow, but fruit flies like a banana".)");
With respect to the parentheses added after the opening quote and before the closing quote: they are part of the raw string syntax, not of the string. Between the opening quote and the ( you may also put a delimiter of your choice, repeated between the ) and the closing quote. It can be almost any combination of up to 16 characters, which helps avoid the situation where the closing sequence is present in the string itself. Specifically, each delimiter character can be:
"any member of the basic source character set except: space, the left
parenthesis (, the right parenthesis ), the backslash \, and the
control characters representing horizontal tab, vertical tab, form
feed, and newline" (N3936 §2.14.5 [lex.string] grammar) and "at most
16 characters" (§2.14.5/2)
How much clearer this makes short strings is debatable, but for longer formatted strings such as HTML or JSON it is far clearer.
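As a rough sketch of the two forms side by side, plus a custom delimiter: the function names and the delimiter json below are made up for this example, and any other valid delimiter would do just as well.

```cpp
#include <string>

// Escaped form: every inner double quote needs a backslash.
std::string escaped_quote() {
    return "She said \"time flies like an arrow, but fruit flies like a banana\".";
}

// Raw form: nothing between R"( and )" is treated as an escape sequence.
std::string raw_quote() {
    return R"(She said "time flies like an arrow, but fruit flies like a banana".)";
}

// Raw form with a custom delimiter ("json" is an arbitrary choice).
// It is needed here because the text itself contains the sequence )"
// which would otherwise end the literal early.
std::string raw_with_delimiter() {
    return R"json({"call": "f(\"x\")"})json";
}
```

The first two functions return the same string. In the third, the backslashes are kept literally, because a raw string literal does no escape processing at all.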

Related

How to remove brackets and quotes using SAS

I have a list of artists that is formatted like so:
['Justin Bieber']
['Brockhampton']
etc
and I want to make it so these variables no longer have quotes or brackets and instead look like:
Justin Bieber
Brockhampton
How would I do this?
Use the compress function.
=compress(artist, "[']");
The second argument adds both square brackets and the quotation mark to the list of characters to remove.
I'm doing this entirely from memory and it's years since I used SAS, so it might struggle with the quotation mark inside the quotation marks. You could also try
=compress(artist, '[]', 'p');
where the third argument adds all punctuation marks to the list of characters to remove.
Anyway, the compress function is what you want. Experiment with it if the exact arguments above don't quite work!

white space in Regular expression

I am using this software, dk-brics-automaton, to get the number of states
of regular expressions. Now, for example, I have this type of RE:
^SEARCH\s+[^\n]{10}
When I insert it as a string, as below, the compiler reports an invalid escape sequence:
RegExp r = new RegExp("^SEARCH\s+[^\n]{10}", ALL);
where ALL is a certain flag.
When I use a double backslash before the small s, the compiler accepts it
as a string. But here \s means whitespace, and I am confused: with the
double backslash, won't it be treated as just a backslash followed by "s", when I meant whitespace?
Now, I have thousands of such regular expressions for which I want to compute finite automaton
states. So, does that mean that I have to add backslashes manually in all the REs?
Here is a link where they have explained something related to this but I am not getting it:
http://www.brics.dk/automaton/doc/index.html
Please help me if anyone has some past experience in this software or if you have any idea to solve this issue.
I had another look at that documentation. "automaton" is a Java package, therefore I think you have to treat them like Java regexes. So just double every backslash inside a regex.
The thing here is, Java does not know "raw" strings. So you have to escape for two levels. The first level that evaluates escape sequences is the string level.
The string level does not know an escape sequence \s; that is the error. \n is fine: the string evaluates it and stores the single character 0x0A instead of the two characters \ (0x5C) and n (0x6E).
Then the string is stored and handed over to the regex constructor. There the next round of escape sequence evaluation happens.
So if you want to escape for the regex level, you have to double the backslashes. The string level will evaluate the \\ to \, and so the regex level gets the correct escape sequences.
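The same two levels exist in ordinary C++ string literals, so the effect can be seen there as well. This is only a sketch using std::regex from the C++ standard library, not dk.brics.automaton, and the function names are made up for the example.

```cpp
#include <regex>
#include <string>

// String level: the compiler turns "\\s" into the two characters
// '\' and 's', and "\n" into one newline character (0x0A).
// The returned text is what the regex engine actually receives.
std::string pattern_text() {
    return "^SEARCH\\s+[^\n]{10}";
}

// Regex level: the engine now reads \s as "any whitespace character".
bool matches_search(const std::string& input) {
    static const std::regex re(pattern_text());
    return std::regex_search(input, re);
}
```

Inspecting pattern_text() shows a backslash followed by s where the whitespace class is meant, and a real newline inside the brackets. That is why the doubled backslash still ends up meaning whitespace.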

How to make a regular expression looking for a list of extensions separated by a space

I want to be able to take a string of text from the user that should be formatted like this:
.ext1 .ext2 .ext3 ...
Basically, I am looking for a dot, a string of alphanumeric characters of any length, a space, and rinse and repeat. I am a little confused about how to say "I need a period, a string of characters, and a space". Also, the last extension could be followed by nothing, by a space, or by a series of spaces. And I guess the extensions could be separated by any number of spaces?
EDIT: I made it clearer what I was looking for.
Thanks!
Try this:
^(?:\.[A-Za-z0-9]+ +)*\.[A-Za-z0-9]+ *$
(Rubular)
In a Java string literal you need to escape the backslashes:
"^(?:\\.[A-Za-z0-9]+ +)*\\.[A-Za-z0-9]+ *$"
(\.\w+)\s* Match this and get your results.
^((\.\w+)\s*)*$ Check this and if it's true, your String is exactly what you want.
For the last pattern thing, you can't (AFAIK) do both getting all extensions (separated) and checking that the last is followed by other things. Either you check your string, or you extract the extensions from it.
I'd start with something like: ^\.[a-z0-9]+([\t\n\v ]+\.[a-z0-9]+)*$
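The "check, then extract" idea from the answers above can be sketched in C++ with std::regex. The question does not say which language is in use, so this is only an illustration, and the function names are hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Check the whole line: ".ext" items separated by one or more spaces,
// optionally followed by trailing spaces.
bool is_extension_list(const std::string& line) {
    static const std::regex whole(R"(^(?:\.[A-Za-z0-9]+ +)*\.[A-Za-z0-9]+ *$)");
    return std::regex_match(line, whole);
}

// Extract every extension, including its leading dot.
// This does not validate the line; call is_extension_list first.
std::vector<std::string> extract_extensions(const std::string& line) {
    static const std::regex one(R"(\.[A-Za-z0-9]+)");
    std::vector<std::string> result;
    for (std::sregex_iterator it(line.begin(), line.end(), one), end; it != end; ++it) {
        result.push_back(it->str());
    }
    return result;
}
```

Keeping the two steps separate mirrors the point made above: one pattern decides whether the input is well formed, the other pulls out the individual items.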

Regex matching spaces, but not in "strings"

I am looking for a regular expression matching spaces only if those spaces are not enclosed in double quotes ("). For example, in
Mary had "a little lamb"
it should match the first and the second space, but not the others.
I want to split the string only at the spaces which are not in the double quotes, and not at the quotes.
I am using C++ with the Qt toolkit and wanted to use QString::split(QRegExp). QString is very similar to std::string and QRegExp is basically POSIX regex encapsulated in a class. If such a regex exists, the split would be trivial.
Examples:
Mary had "a little lamb" => Mary,had,"a little lamb"
1" 2 "3 => 1" 2 "3 (no splitting at ")
abc def="g h i" "j k" = 12 => abc,def="g h i","j k",=,12
Sorry for the edits, I was very imprecise when I asked the question first. Hope it is somewhat more clear now.
(I know you just posted almost exactly the same answer yourself, but I can't bear to just throw all this away. :-/)
If it's possible to solve your problem with a regex split operation, the regex will have to match even numbers of quotation marks, as MSalters said. However, a split regex should match only the spaces you're splitting on, so the rest of the work has to be done in a lookahead. Here's what I would use:
" +(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)"
If the text is well formed, a lookahead for an even number of quotes is sufficient to determine that the just-matched space is not inside a quoted sequence. That is, lookbehinds aren't necessary, which is good because QRegExp doesn't seem to support them. Escaped quotes can be accommodated too, but the regex becomes quite a bit larger and uglier. But if you can't be sure the text is well formed, it's extremely unlikely you'll be able to solve your problem with split().
By the way, QRegExp does not implement POSIX regular expressions--if it did, it wouldn't support lookaheads OR lookbehinds. Instead, it falls into the loosely-defined category of Perl-compatible regex flavors.
What should happen to "a" b "c" ?
Note that in the substring " b " the spaces are between quotes.
-- edit --
I assume a space is "between quotes" if it is preceded and followed by an odd number of standard quotation marks (i.e. U+0022, I'll ignore those funny Unicode “quotes”).
That means you need the following regex: ^[^"]*("[^"]*"[^"]*)*"[^"]* [^"]*"[^"]*("[^"]*"[^"]*)*$
("[^"]*"[^"]*) represents a pair of quotes. ("[^"]*"[^"]*)* is an even number of quotes, ("[^"]*"[^"]*)*" an odd number. Then there's the actual quoted string part, followed by another odd number of quotes. The ^ and $ anchors are needed because you need to count every quote from the beginning of the string. This answers the " b " substring problem above by never looking at substrings. The price is that every character in your input must be matched against the entire string, which turns this into an O(N*N) split operation.
The reason why you can do this in a regex is because there is a finite amount of memory needed. Effectively just one bit; "have I seen an odd or even number of quotes so far?". You don't actually have to match up individual "" pairs.
This is not the only interpretation possible, though. If you do include “funny Unicode quotes” which should be paired, you also need to deal with ““double quoted”” strings. This in turn means you need a count of open “, which means you need infinite storage, which in turn means it's no longer a regular language, which means you can't use a regex. QED.
Anyway, even if it was possible, you still would want a proper parser. The O(N*N) behavior to count the number of quotes preceding each character just isn't funny. If you already know there are X quotes preceding Str[N], it should be an O(1) operation to determine how many quotes precede Str[N+1], not O(N). The possible answers are after all just X or X+1 !
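A minimal sketch of that single-pass idea, keeping only the one bit of state described above. The function name is made up, and dropping empty tokens produced by repeated spaces is my own choice, not something the question specifies.

```cpp
#include <string>
#include <vector>

// Split on spaces that are outside double quotes. The only state carried
// from one character to the next is one bit: are we inside quotes?
std::vector<std::string> split_outside_quotes(const std::string& text) {
    std::vector<std::string> tokens;
    std::string current;
    bool in_quotes = false;
    for (char c : text) {
        if (c == '"') {
            in_quotes = !in_quotes;  // one more quote seen: flip the parity
            current += c;            // quotes stay part of the token
        } else if (c == ' ' && !in_quotes) {
            if (!current.empty()) {  // assumption: skip empty tokens
                tokens.push_back(current);
                current.clear();
            }
        } else {
            current += c;
        }
    }
    if (!current.empty()) tokens.push_back(current);
    return tokens;
}
```

Each character is looked at once, so the whole split is O(N) rather than O(N*N).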
MSalters pushed me on the right track. The problem with his answer is that the regex he gives always matches the whole string and so is unsuitable for split(), but this can be partly redeemed by a lookahead match. Assuming that the quotes are always paired (they are indeed), I can split on every space which is followed by an even number of quotes.
The regex without C escapes and in single quotes looks like
' (?=[^"]*("[^"]*"[^"]*)*$)'
In the source it finally looked like (using Qt and C++)
QString buf("Mary had \"a little lamb\""); // string we want to split
QStringList splitted = buf.split( QRegExp(" (?=[^\"]*(\"[^\"]*\"[^\"]*)*$)") );
Simple, eh?
As for performance, the strings are parsed once at the start of the program; there are a few dozen of them and each is less than a hundred characters long. I will test the runtime with long strings, just to be sure nothing bad happens ;-)
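For readers without Qt, the same lookahead pattern can be tried with std::regex from the standard library. This is only a sketch and not the QRegExp code above: the function name is hypothetical, and it assumes the default ECMAScript grammar, which supports lookahead.

```cpp
#include <regex>
#include <string>
#include <vector>

// Split on every space that is followed by an even number of double
// quotes up to the end of the string (assumes quotes are always paired).
std::vector<std::string> split_with_lookahead(const std::string& text) {
    static const std::regex sep(R"re( (?=[^"]*("[^"]*"[^"]*)*$))re");
    std::vector<std::string> parts;
    // The -1 selects the text between matches, i.e. a split.
    std::sregex_token_iterator it(text.begin(), text.end(), sep, -1), end;
    for (; it != end; ++it) {
        parts.push_back(it->str());
    }
    return parts;
}
```

Writing the pattern as a raw string literal avoids escaping every quote inside it.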
If the quoting in the strings is simple (like your examples), you can use alternation. This regex first hunts for a simple quoted string; failing that it finds spaces.
/(\"[^\"]*\"| +)/
In Perl, if you use grouping in the regex when calling split(), the function returns not only the elements but also the captured groups (in this case, our delimiter). If you then filter out the blank and spaces-only delimiters, you will get the desired list of elements. I don't know whether a similar strategy would work in C++, but the following Perl code does work:
use strict;
use warnings;
while (<DATA>){
    chomp;
    my @elements = split /(\"[^\"]*\"| +)/, $_;
    @elements = grep {length and /[^ ]/} @elements;
    # Do stuff with @elements
}
__DATA__
Mary had "a little lamb"
1" 2 "3
abc def="g h i" "j k" = 12
Simplest regex solution: match both quoted strings and single whitespace characters, then filter out the quoted strings afterwards.
"[^"]*"|\s